import stanford_mir; stanford_mir.init()
We will mainly use two libraries for audio acquisition and playback:
librosa is a Python package for music and audio processing by Brian McFee. A large portion was ported from Dan Ellis's Matlab audio processing examples.
IPython.display.Audio lets you play audio directly in an IPython notebook.
This GitHub repository includes many short audio excerpts for your convenience.
Here are the files currently in the audio directory:
ls audio
Visit https://ccrma.stanford.edu/workshops/mir2014/audio/ for more audio files.
Use librosa.load to load an audio file into an audio array. It returns both the audio array and the sample rate:
import librosa
x, sr = librosa.load('audio/simple_loop.wav')
If you receive an error with librosa.load, you may need to install ffmpeg.
Display the length of the audio array and sample rate:
print(x.shape)
print(sr)
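The array length and the sample rate together give the duration of the clip: the number of samples divided by the samples per second. Here is a minimal sketch of that arithmetic. The helper duration_seconds is a hypothetical name (it is not part of librosa), and a silent stand-in array is used so the example runs without an audio file:

```python
import numpy

def duration_seconds(x, sr):
    """Return the length of a mono audio array in seconds."""
    return x.shape[0] / float(sr)

# Stand-in for a loaded signal: 3 seconds of silence at 22050 Hz.
sr_demo = 22050
x_demo = numpy.zeros(3 * sr_demo)

print(x_demo.shape)                       # number of samples
print(duration_seconds(x_demo, sr_demo))  # length in seconds
```

Note that librosa.load resamples to 22050 Hz and mixes down to mono by default, so the printed sample rate may differ from the one stored in the file. Pass sr=None to keep the file's native sample rate.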
In order to display plots inside the Jupyter notebook, run the following commands, preferably at the top of your notebook:
%matplotlib inline
import matplotlib.pyplot as plt
import librosa.display
Plot the audio array using librosa.display.waveplot (renamed to librosa.display.waveshow in librosa 0.9 and removed in 0.10):
plt.figure(figsize=(14, 5))
librosa.display.waveplot(x, sr=sr)
Display a spectrogram using librosa.display.specshow:
X = librosa.stft(x) # complex-valued short-time Fourier transform
Xdb = librosa.amplitude_to_db(abs(X)) # magnitudes converted to decibels
plt.figure(figsize=(14, 5))
librosa.display.specshow(Xdb, sr=sr, x_axis='time', y_axis='hz')
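To make the two preprocessing steps less opaque, here is a rough NumPy-only sketch of what they produce. It assumes librosa's default settings (n_fft=2048, hop_length=512, centered frames, ref=1.0, amin=1e-5, top_db=80). The helpers stft_shape and amplitude_to_db_sketch are hypothetical names written for illustration, not librosa functions, and they only approximate the library's behavior:

```python
import numpy

def stft_shape(num_samples, n_fft=2048, hop_length=512):
    """Expected (frequency bins, frames) of a centered STFT (assumed defaults)."""
    return (1 + n_fft // 2, 1 + num_samples // hop_length)

def amplitude_to_db_sketch(S, ref=1.0, amin=1e-5, top_db=80.0):
    """Convert magnitudes to dB: 20*log10(S/ref), floored at amin,
    then clipped so nothing lies more than top_db below the peak."""
    S = numpy.abs(S)
    log_spec = 20.0 * numpy.log10(numpy.maximum(amin, S))
    log_spec -= 20.0 * numpy.log10(numpy.maximum(amin, ref))
    return numpy.maximum(log_spec, log_spec.max() - top_db)

# A 2-second signal at 22050 Hz has 44100 samples.
print(stft_shape(44100))

# Each factor of 10 in magnitude corresponds to 20 dB.
mags = numpy.array([1.0, 0.1, 0.01, 1e-9])
db = amplitude_to_db_sketch(mags)
print(db)
```

The last magnitude is far below the floor, so it is clipped to 80 dB below the loudest value rather than reported at its true level. This clipping is what keeps the color range of the spectrogram readable.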
IPython.display.Audio
Using IPython.display.Audio, you can play an audio file:
import IPython.display as ipd
ipd.Audio('audio/conga_groove.wav') # load a local WAV file
Audio can also accept a NumPy array. Let's synthesize a pure tone at 440 Hz:
import numpy
sr = 22050 # sample rate
T = 2.0 # seconds
t = numpy.linspace(0, T, int(T*sr), endpoint=False) # time variable
x = 0.5*numpy.sin(2*numpy.pi*440*t) # pure sine wave at 440 Hz
Listen to the audio array:
ipd.Audio(x, rate=sr) # load a NumPy array
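If you want tones at other frequencies or durations, the synthesis above can be wrapped in a small function. This is only a sketch: synthesize_tone is a hypothetical helper name, and its default values simply mirror the numbers used above.

```python
import numpy

def synthesize_tone(freq, duration, sr=22050, amplitude=0.5):
    """Return a sine wave of the given frequency (Hz) and duration (seconds)."""
    n = int(duration * sr)             # total number of samples
    t = numpy.arange(n) / float(sr)    # sample times in seconds
    return amplitude * numpy.sin(2 * numpy.pi * freq * t)

tone = synthesize_tone(440, 2.0)
print(tone.shape)              # number of samples in the tone
print(numpy.abs(tone).max())   # peak amplitude, at most 0.5
```

In a notebook, ipd.Audio(tone, rate=22050) would then play the result, just as with x above.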
librosa.output.write_wav saves a NumPy array to a WAV file. (This function was removed in librosa 0.8; with newer versions, use soundfile.write instead.)
librosa.output.write_wav('audio/tone_440.wav', x, sr)
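If neither function is available in your environment, a mono array can also be written with Python's built-in wave module. The sketch below is one possible approach, not what librosa does internally. The helper write_wav_16bit is a hypothetical name, and it assumes a mono float signal in the range [-1, 1] that is stored as 16-bit PCM. It writes to a temporary directory so that it does not touch the audio directory:

```python
import os
import tempfile
import wave

import numpy

def write_wav_16bit(path, x, sr):
    """Write a mono float array in [-1, 1] to a 16-bit PCM WAV file."""
    clipped = numpy.clip(x, -1.0, 1.0)        # guard against overflow
    pcm = (clipped * 32767).astype('<i2')     # little-endian 16-bit integers
    with wave.open(path, 'wb') as f:
        f.setnchannels(1)                     # mono
        f.setsampwidth(2)                     # 2 bytes per sample
        f.setframerate(sr)
        f.writeframes(pcm.tobytes())

# One second of a 440 Hz tone, written to a temporary location.
sr_out = 22050
t_out = numpy.arange(sr_out) / float(sr_out)
tone_out = 0.5 * numpy.sin(2 * numpy.pi * 440 * t_out)
out_path = os.path.join(tempfile.mkdtemp(), 'tone_440.wav')
write_wav_16bit(out_path, tone_out, sr_out)

# Read the header back to confirm what was written.
with wave.open(out_path, 'rb') as f:
    print(f.getnchannels(), f.getframerate(), f.getnframes())
```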