# Onset Detection

Automatic detection of musical events in an audio signal is one of the most fundamental tasks in music information retrieval. Here, we will show how to detect an onset, the very instant that marks the beginning of the transient part of a sound, or the earliest moment at which a transient can be reliably detected.

Load an audio file as a NumPy array x, together with its sampling rate sr.

In [3]:
import librosa
x, sr = librosa.load('audio/classic_rock_beat.wav')


## librosa.onset.onset_detect

librosa.onset.onset_detect works in the following way:

1. Compute a spectral novelty function.
2. Find peaks in the spectral novelty function.
3. [optional] Backtrack from each peak to a preceding local minimum. Backtracking can be useful for finding segmentation points such that the onset occurs shortly after the beginning of the segment.
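The three steps above can be sketched without librosa. This is a simplified, NumPy-only illustration and not librosa's implementation: the novelty function is a plain log-magnitude spectral flux on uncentered frames, the peak picker uses a single global threshold rather than the pre_max, post_max, pre_avg, post_avg and wait parameters, and the names spectral_novelty, pick_peaks and backtrack are made up for this example. The test signal (two decaying tones starting at 0.5 s and 1.25 s) is synthetic.

```python
import numpy as np

def spectral_novelty(x, n_fft=1024, hop_length=512):
    """Step 1: half-wave rectified spectral flux, one value per frame."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop_length
    frames = np.stack([x[i * hop_length:i * hop_length + n_fft] * window
                       for i in range(n_frames)])
    mag = np.log1p(np.abs(np.fft.rfft(frames, axis=1)))   # log-compressed magnitude spectra
    increase = np.maximum(np.diff(mag, axis=0), 0.0)      # keep only rises in energy
    return np.concatenate([[0.0], increase.sum(axis=1)])  # index i corresponds to frame i

def pick_peaks(novelty, threshold):
    """Step 2: local maxima of the novelty function that exceed a threshold."""
    mid = novelty[1:-1]
    is_peak = (mid > novelty[:-2]) & (mid >= novelty[2:]) & (mid > threshold)
    return np.flatnonzero(is_peak) + 1

def backtrack(peaks, novelty):
    """Step 3: walk each peak back to the preceding local minimum."""
    out = []
    for p in peaks:
        while p > 0 and novelty[p - 1] < novelty[p]:
            p -= 1
        out.append(p)
    return np.array(out, dtype=int)

# Synthetic test signal: two tones with an abrupt attack and an exponential decay.
sr = 22050
hop_length = 512
x = np.zeros(2 * sr)
for onset_time, freq in [(0.5, 440.0), (1.25, 660.0)]:
    start = int(onset_time * sr)
    n = np.arange(len(x) - start)
    x[start:] += np.cos(2 * np.pi * freq * n / sr) * 0.5 ** (n / (0.1 * sr))

novelty = spectral_novelty(x, hop_length=hop_length)
peaks = pick_peaks(novelty, threshold=0.1 * novelty.max())
segment_starts = backtrack(peaks, novelty)
peak_times = peaks * hop_length / sr   # start time of each peak frame, in seconds
print(peaks, segment_starts, peak_times)
```

The backtracked indices are never later than the peaks, which is what makes them usable as segment boundaries: each onset then falls shortly after the start of its segment.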

Compute the frame indices for estimated onsets in a signal:

In [6]:
# Pass the signal by keyword: recent librosa versions no longer accept it positionally.
onset_frames = librosa.onset.onset_detect(y=x, sr=sr, wait=1, pre_avg=1, post_avg=1, pre_max=1, post_max=1)
print(onset_frames)  # frame indices of estimated onsets

[ 20  29  38  57  65  75  84  93 103 112 121 131 140 148 158 167 176 185
213 232 241 250 260 268 278 288]


Convert onsets to units of seconds:

In [7]:
onset_times = librosa.frames_to_time(onset_frames, sr=sr)
print(onset_times)

[0.46439909 0.67337868 0.88235828 1.32353741 1.50929705 1.7414966
1.95047619 2.15945578 2.39165533 2.60063492 2.80961451 3.04181406
3.25079365 3.43655329 3.66875283 3.87773243 4.08671202 4.29569161
4.94585034 5.38702948 5.59600907 5.80498866 6.03718821 6.22294785
6.45514739 6.68734694]
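The conversion itself is simple arithmetic: frame index times hop length, divided by the sampling rate. A minimal sketch, assuming librosa's default hop length of 512 samples and the default sampling rate of 22050 Hz that librosa.load resamples to:

```python
import numpy as np

sr = 22050         # default sampling rate used by librosa.load
hop_length = 512   # default hop length assumed by the frame-based functions

onset_frames = np.array([20, 29, 38, 57])   # first four frame indices from above
onset_times = onset_frames * hop_length / sr
print(onset_times)
```

These values agree with the first four onset times printed above.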


Plot the onsets on top of a spectrogram of the audio:

In [8]:
S = librosa.stft(x)
logS = librosa.amplitude_to_db(abs(S))


Let's also plot the onsets with the time-domain waveform.
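One possible way to draw such a plot with plain matplotlib is sketched here. The signal and onset times are placeholders so the example runs on its own; in the notebook they would be x and onset_times. The styling choices (figure size, red lines) are assumptions, not taken from the original figure.

```python
import numpy as np
import matplotlib.pyplot as plt

sr = 22050
# Placeholder data standing in for the loaded audio and the detected onsets.
x = 0.1 * np.random.default_rng(0).standard_normal(2 * sr)
onset_times = np.array([0.5, 1.25])

t = np.arange(len(x)) / sr   # time axis in seconds
fig, ax = plt.subplots(figsize=(10, 3))
ax.plot(t, x, linewidth=0.5)
# One vertical line per onset, spanning the full amplitude range of the signal.
lines = ax.vlines(onset_times, x.min(), x.max(), color="r")
ax.set_xlabel("Time (s)")
ax.set_ylabel("Amplitude")
```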


## librosa.clicks

We can synthesize a click at the location of each detected onset. Passing length=len(x) makes the click track the same length as the audio.

In [11]:
clicks = librosa.clicks(frames=onset_frames, sr=sr, length=len(x))


Listen to the original audio plus the detected onsets. One way is to add the signals together, sample-wise:

In [12]:
import IPython.display as ipd
ipd.Audio(x + clicks, rate=sr)

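Sample-wise addition only works when both arrays have the same length, and the sum can exceed the [-1, 1] range. The sketch below illustrates both points with NumPy alone. The helper make_clicks is hypothetical and only approximates a click track (a 1 kHz sine with an exponential decay lasting 0.1 s is assumed); it is not librosa's implementation, and the noise signal is a placeholder for real audio.

```python
import numpy as np

def make_clicks(times, sr, length, freq=1000.0, duration=0.1):
    """Hypothetical stand-in for a click track: decaying sine bursts at the given times."""
    n = int(sr * duration)
    envelope = np.logspace(0, -10, n, base=2.0)   # decays from 1 to 2**-10
    click = np.sin(2 * np.pi * freq * np.arange(n) / sr) * envelope
    out = np.zeros(length)
    for t in times:
        start = int(round(t * sr))
        if start >= length:
            continue                              # click would start past the end
        end = min(start + n, length)
        out[start:end] += click[:end - start]     # truncate a click that overruns
    return out

sr = 22050
x = 0.1 * np.random.default_rng(0).standard_normal(2 * sr)   # placeholder audio
clicks = make_clicks([0.5, 1.25], sr=sr, length=len(x))

mix = x + clicks                       # same length, so the arrays add element by element
mix /= max(1.0, np.abs(mix).max())     # rescale only if the sum would clip
```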