Jupyter Audio Basics

from mirdotcom import mirdotcom

mirdotcom.init()

Audio Libraries

We will mainly use two libraries for audio acquisition and playback:

1. librosa

librosa is a Python package for music and audio processing by Brian McFee. A large portion of it was ported from Dan Ellis’s MATLAB audio processing examples.

2. IPython.display.Audio

IPython.display.Audio lets you play audio directly in an IPython notebook.

Included Audio Data

This GitHub repository includes many short audio excerpts for your convenience.

Here are the files currently in the audio directory:

mirdotcom.list_audio()
simple_piano.wav
latin_groove.mp3
clarinet_c6.wav
cowbell.wav
classic_rock_beat.mp3
oboe_c6.wav
sir_duke_trumpet_fast.mp3
sir_duke_trumpet_slow.mp3
jangle_pop.mp3
125_bounce.wav
brahms_hungarian_dance_5.mp3
58bpm.wav
conga_groove.wav
funk_groove.mp3
tone_440.wav
sir_duke_piano_fast.mp3
thx_original.mp3
simple_loop.wav
classic_rock_beat.wav
c_strum.wav
prelude_cmaj.wav
sir_duke_piano_slow.mp3
busta_rhymes_hits_for_days.mp3

Visit https://ccrma.stanford.edu/workshops/mir2014/audio/ for more audio files.

Reading Audio

Use librosa.load to load an audio file into a NumPy array. It returns both the audio array and the sample rate. By default, librosa.load converts the signal to mono and resamples it to 22050 Hz:

import librosa

fp = mirdotcom.get_audio("simple_loop.wav")
x, sr = librosa.load(fp)

If you receive an error with librosa.load, you may need to install ffmpeg.

Display the length of the audio array and sample rate:

print(x.shape)
print(sr)
(49613,)
22050
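
The sample count and the sample rate together give the duration of the clip. A minimal sketch of that arithmetic, using the two values printed above (the variable names are illustrative only):

```python
# Values printed above for simple_loop.wav
num_samples = 49613
sr = 22050  # samples per second

# Duration in seconds = number of samples / sample rate
duration = num_samples / sr
print(f"{duration:.2f} s")  # prints "2.25 s"
```

In a notebook, librosa.get_duration(y=x, sr=sr) computes the same quantity directly from the array.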

Visualizing Audio

In order to display plots inside the Jupyter notebook, run the following commands, preferably at the top of your notebook:

%matplotlib inline
import matplotlib.pyplot as plt
import librosa.display

Plot the audio array using librosa.display.waveshow:

plt.figure(figsize=(14, 5))
librosa.display.waveshow(x, sr=sr)
[Figure: waveform plot of the audio array]

Display a spectrogram using librosa.display.specshow:

X = librosa.stft(x)
Xdb = librosa.amplitude_to_db(abs(X))
plt.figure(figsize=(14, 5))
librosa.display.specshow(Xdb, sr=sr, x_axis="time", y_axis="hz")
[Figure: spectrogram of the audio array, with time on the x-axis and frequency in Hz on the y-axis]
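
The shape of the matrix returned by librosa.stft follows from its parameters. Here is a rough sketch of the arithmetic, assuming librosa's default settings (n_fft=2048, hop_length=n_fft // 4, center=True) and the 49613-sample array loaded earlier. If you pass different parameters, the numbers change accordingly.

```python
# Assumed librosa.stft defaults: n_fft=2048, hop_length=n_fft // 4, center=True
n_fft = 2048
hop_length = n_fft // 4
num_samples = 49613  # length of x printed earlier
sr = 22050           # sample rate printed earlier

num_bins = 1 + n_fft // 2                   # rows: frequencies from 0 Hz to sr / 2
num_frames = 1 + num_samples // hop_length  # columns: one per hop, frames centered
bin_spacing_hz = sr / n_fft                 # frequency gap between adjacent rows
frame_spacing_s = hop_length / sr           # time gap between adjacent columns

print(num_bins, num_frames)                                  # prints "1025 97"
print(round(bin_spacing_hz, 2), round(frame_spacing_s, 4))   # prints "10.77 0.0232"
```

Under these assumptions, X.shape should be (1025, 97): each row covers about 10.77 Hz and each column advances about 23 ms.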

Playing Audio

IPython.display.Audio

Using IPython.display.Audio, you can play an audio file:

import IPython.display as ipd

fp = mirdotcom.get_audio("conga_groove.wav")
ipd.Audio(fp)  # load a local WAV file

Audio can also accept a NumPy array. Let’s synthesize a pure tone at 440 Hz:

import numpy

sr = 22050  # sample rate
T = 2.0  # seconds
t = numpy.linspace(0, T, int(T * sr), endpoint=False)  # time variable
x = 0.5 * numpy.sin(2 * numpy.pi * 440 * t)  # pure sine wave at 440 Hz

Listen to the audio array:

ipd.Audio(x, rate=sr)  # load a NumPy array
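
The same recipe extends to more than one tone. The sketch below wraps the synthesis in a helper and mixes two sine waves. The function name sine_tone, its parameters, and the choice of 660 Hz as the second tone are assumptions made for illustration; they are not part of any library.

```python
import numpy


def sine_tone(freq_hz, duration_s, sr=22050, amplitude=0.5):
    """Return a sine wave as a 1-D float array (hypothetical helper)."""
    n = int(duration_s * sr)      # total number of samples
    t = numpy.arange(n) / sr      # time stamp of each sample, in seconds
    return amplitude * numpy.sin(2 * numpy.pi * freq_hz * t)


sr = 22050
tone_a = sine_tone(440, 2.0, sr=sr)  # 440 Hz
tone_b = sine_tone(660, 2.0, sr=sr)  # 660 Hz, a 3:2 ratio above 440 Hz

# Average the two tones so the peak of the mix stays within [-0.5, 0.5]
mix = 0.5 * (tone_a + tone_b)
print(mix.shape)  # prints "(44100,)"
```

In a notebook, ipd.Audio(mix, rate=sr) would play the result in the same way as the single tone.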

Writing Audio

soundfile.write saves a NumPy array to a WAV file. It takes the output path, the audio array, and the sample rate:

import soundfile

fp = mirdotcom.get_audio("tone_440.wav")
soundfile.write(fp, x, sr)
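
To see what goes into a WAV file, here is a sketch that writes the same 440 Hz tone using only Python's standard-library wave module instead of soundfile. The 16-bit mono PCM format and the temporary output path are choices made for this example, not something the notebook prescribes.

```python
import math
import os
import struct
import tempfile
import wave

sr = 22050
duration_s = 2.0
num_samples = int(duration_s * sr)

# Scale floats in [-1, 1] to 16-bit signed integers (peak amplitude 0.5)
samples = [
    int(32767 * 0.5 * math.sin(2 * math.pi * 440 * n / sr))
    for n in range(num_samples)
]

# Hypothetical output location in the system temp directory
path = os.path.join(tempfile.gettempdir(), "tone_440_sketch.wav")

with wave.open(path, "wb") as wf:
    wf.setnchannels(1)    # mono
    wf.setsampwidth(2)    # 2 bytes per sample = 16-bit PCM
    wf.setframerate(sr)
    wf.writeframes(struct.pack(f"<{num_samples}h", *samples))

# Read the header back to confirm what was written
with wave.open(path, "rb") as wf:
    info = (wf.getnchannels(), wf.getsampwidth(), wf.getframerate(), wf.getnframes())
print(info)  # prints "(1, 2, 22050, 44100)"
```

In practice soundfile.write handles the conversion from a float array for you, so this is only meant to show the pieces involved: channel count, sample width, sample rate, and the packed sample data.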