Automatic detection of musical events in an audio signal is one of the most fundamental tasks in music information retrieval. Here, we will show how to detect an onset, the very instant that marks the beginning of the transient part of a sound, or the earliest moment at which a transient can be reliably detected.
Load an audio file into the NumPy array x and sampling rate sr.
import numpy
import librosa
import IPython.display as ipd
x, sr = librosa.load('audio/classic_rock_beat.wav')
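librosa.onset.onset_detect
librosa.onset.onset_detect works in the following way: it computes a spectral novelty function (the onset strength envelope), picks the peaks of that function, and optionally backtracks each peak to the preceding local minimum.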
Compute the frame indices for estimated onsets in a signal:
onset_frames = librosa.onset.onset_detect(y=x, sr=sr, wait=1, pre_avg=1, post_avg=1, pre_max=1, post_max=1)
print(onset_frames) # frame numbers of estimated onsets
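To give a feel for what an onset detector looks at, here is a minimal sketch of a novelty function. It is not librosa's implementation: librosa derives its onset strength from spectral change, whereas this sketch substitutes a simpler frame-energy difference. The name energy_novelty, the tiny frame sizes, and the toy signal are all assumptions made for illustration.

```python
import math

def energy_novelty(signal, frame_length=4, hop_length=2):
    """Half-wave rectified difference of per-frame RMS energy (hypothetical helper)."""
    rms = []
    for start in range(0, len(signal) - frame_length + 1, hop_length):
        frame = signal[start:start + frame_length]
        rms.append(math.sqrt(sum(s * s for s in frame) / frame_length))
    # Keep only increases in energy; a drop in energy is not an onset.
    novelty = [0.0]
    for prev, cur in zip(rms, rms[1:]):
        novelty.append(max(0.0, cur - prev))
    return novelty

# Toy signal: silence, then a sudden burst starting at sample 8.
toy = [0.0] * 8 + [1.0, -1.0] * 4
novelty = energy_novelty(toy)
peak_frame = max(range(len(novelty)), key=novelty.__getitem__)
print(novelty, peak_frame)
```

The largest novelty value falls on the first frame that overlaps the burst, which is the behavior an onset detector relies on.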
Convert onsets to units of seconds:
onset_times = librosa.frames_to_time(onset_frames, sr=sr)
print(onset_times)
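The conversion itself is simple arithmetic: the frame index times the hop length, divided by the sampling rate. The sketch below assumes librosa's default hop length of 512 samples and default sampling rate of 22050 Hz. frames_to_seconds is a hypothetical helper, not a librosa function, and the frame indices are made up.

```python
def frames_to_seconds(frames, sr=22050, hop_length=512):
    """Convert frame indices to times in seconds (hypothetical helper)."""
    return [frame * hop_length / sr for frame in frames]

example_frames = [0, 43, 86]  # made-up frame indices for illustration
example_times = frames_to_seconds(example_frames)
print(example_times)
```

If the onsets were detected with a different hop length, the same value has to be used for the conversion, otherwise the times will be scaled incorrectly.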
Plot the onsets on top of a spectrogram of the audio:
S = librosa.stft(x)
logS = librosa.amplitude_to_db(abs(S))
Let's also plot the onsets with the time-domain waveform.
We can add a click at the location of each detected onset.
clicks = librosa.clicks(frames=onset_frames, sr=sr, length=len(x))
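Conceptually, a click track is a silent signal with a short burst placed at each onset. The following is a rough sketch of that idea in plain Python, not librosa's code: the helper make_click_track, the 1000 Hz tone, the 0.1 s duration, and the decay curve are assumptions chosen for illustration, and onsets are given in samples rather than frames.

```python
import math

def make_click_track(onset_samples, length, sr=22050, freq=1000.0, duration=0.1):
    """Place a decaying sine burst at each onset sample (hypothetical helper)."""
    n_click = int(sr * duration)
    click = [
        math.sin(2 * math.pi * freq * n / sr) * 2.0 ** (-10.0 * n / n_click)
        for n in range(n_click)
    ]
    track = [0.0] * length
    for start in onset_samples:
        for n, value in enumerate(click):
            if start + n >= length:
                break  # truncate clicks that run past the end of the track
            track[start + n] += value
    return track

track = make_click_track(onset_samples=[0, 11025], length=22050)
print(len(track), max(abs(v) for v in track))
```

Making the track the same length as the original signal is what allows the two to be added sample by sample.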
Listen to the original audio plus the detected onsets. One way is to add the signals together, sample-wise:
ipd.Audio(x + clicks, rate=sr)
Another method is to play the original track in one stereo channel and the click track in the other stereo channel:
ipd.Audio(numpy.vstack([x, clicks]), rate=sr)
You can also change the click to a custom audio file instead:
cowbell, _ = librosa.load('audio/cowbell.wav')
More cowbell?
clicks = librosa.clicks(frames=onset_frames, sr=sr, length=len(x), click=cowbell)
ipd.Audio(x + clicks, rate=sr)
In librosa.onset.onset_detect, use the backtrack=True parameter. What does that do, and how does it affect the detected onsets? (See librosa.onset.onset_backtrack.)
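As a starting point for this question, backtracking can be pictured as moving each detected peak back to the nearest preceding local minimum of an energy curve. Here is a minimal sketch of that idea. It is a simplified illustration, not librosa's code, and the helper name and the energy values are made up.

```python
def backtrack_to_minimum(onset_frames, energy):
    """Move each onset back to the nearest preceding local minimum (hypothetical helper)."""
    # A frame counts as a local minimum if it is no larger than the previous
    # frame and strictly smaller than the next one. Frame 0 is a fallback.
    minima = [0] + [
        i for i in range(1, len(energy) - 1)
        if energy[i] <= energy[i - 1] and energy[i] < energy[i + 1]
    ]
    backtracked = []
    for onset in onset_frames:
        preceding = [m for m in minima if m <= onset]
        backtracked.append(max(preceding))
    return backtracked

energy = [0.1, 0.0, 0.2, 0.9, 0.4, 0.3, 0.8, 0.5]  # made-up energy curve
print(backtrack_to_minimum([3, 6], energy))  # prints [1, 5]
```

The peaks at frames 3 and 6 move back to frames 1 and 5, where the energy starts to rise. This is why backtracked onsets tend to be better cut points for segmenting audio.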
In librosa.onset.onset_detect, you can use the keyword parameters found in librosa.util.peak_pick, e.g. pre_max, post_max, pre_avg, post_avg, delta, and wait, to control the peak picking algorithm. Adjust these parameters. How do they affect the detected onsets?
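To experiment with these parameters away from real audio, it can help to see the three peak-picking conditions written out: a frame must be the maximum of its local window, must exceed the local mean by at least delta, and must be more than wait frames after the previous peak. The sketch below is a simplified pure-Python version. The helper pick_peaks is hypothetical, its handling of window edges is an assumption and may differ from librosa's, and it assumes post_max and post_avg are at least 1.

```python
def pick_peaks(x, pre_max, post_max, pre_avg, post_avg, delta, wait):
    """Simplified peak picking over a novelty curve (hypothetical helper)."""
    peaks = []
    last_peak = None
    for n in range(len(x)):
        # Clip windows at the start of the curve; slicing clips them at the end.
        max_window = x[max(0, n - pre_max):n + post_max]
        avg_window = x[max(0, n - pre_avg):n + post_avg]
        is_local_max = x[n] == max(max_window)
        above_mean = x[n] >= sum(avg_window) / len(avg_window) + delta
        far_enough = last_peak is None or n - last_peak > wait
        if is_local_max and above_mean and far_enough:
            peaks.append(n)
            last_peak = n
    return peaks

novelty_curve = [0.0, 0.1, 0.9, 0.2, 0.1, 0.8, 0.85, 0.1]  # made-up values
loose = pick_peaks(novelty_curve, 2, 2, 2, 2, delta=0.1, wait=1)
strict = pick_peaks(novelty_curve, 2, 2, 2, 2, delta=0.5, wait=1)
print(loose, strict)  # prints [2, 6] [2]
```

Raising delta removes the second peak because it no longer stands far enough above its local mean. Widening the max and average windows has a similar thinning effect on closely spaced peaks.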