Evaluation using mir_eval#

import matplotlib.pyplot as plt
import librosa
import mir_eval
import numpy

from mirdotcom import mirdotcom

mirdotcom.init()

mir_eval (documentation, paper) is a Python library containing evaluation functions for a variety of common audio and music processing tasks.

mir_eval was primarily created by Colin Raffel. This notebook was created by Brian McFee and edited by Steve Tjoa.

Why mir_eval?#

Most tasks in MIR are complicated. Evaluation is also complicated!

Any given task has many ways to evaluate a system. There is no one right way.

For example, here are issues to consider when choosing an evaluation method:

  • event matching

  • time padding

  • tolerance windows

  • vocabulary alignment

mir_eval tasks and submodules#

  • onset, tempo, beat

  • chord, key

  • melody, multipitch

  • transcription

  • segment, hierarchy, pattern

  • separation (like the BSS Eval toolbox in MATLAB)

Install mir_eval#

pip install mir_eval

If that doesn’t work:

pip install --no-deps mir_eval

Example: Onset Detection#

filename = mirdotcom.get_audio("simple_piano.wav")
y, sr = librosa.load(filename)
# Estimate onsets.
est_onsets = librosa.onset.onset_detect(y=y, sr=sr, units="time")
est_onsets
# Load the reference annotation.
ref_onsets = numpy.array([0.1, 0.21, 0.3])
mir_eval.onset.evaluate(ref_onsets, est_onsets)

mir_eval finds the largest feasible set of matches using the Hopcroft-Karp algorithm.

Example: Beat Tracking#

est_tempo, est_beats = librosa.beat.beat_track(y=y, sr=sr)
est_beats = librosa.frames_to_time(est_beats, sr=sr)
est_beats
# Load the reference annotation.
ref_beats = numpy.array([0.53, 1.02])
mir_eval.beat.evaluate(ref_beats, est_beats)

Example: Chord Estimation#

# mir_eval.chord.evaluate()

Hidden benefits#

  • Input validation! Many errors can be traced back to ill-formatted data.

  • Standardized behavior, full test coverage.

More than metrics#

mir_eval has tools for display and sonification.

import librosa.display
import mir_eval.display

Common plots:

  • events, labeled_intervals

  • pitch, multipitch, piano_roll

  • segments, hierarchy, separation

Example: Events#

S = librosa.feature.melspectrogram(y=y, sr=sr)
# Convert the power spectrogram to decibels for display.
S_db = librosa.power_to_db(S, ref=numpy.max)
librosa.display.specshow(S_db, x_axis="time", y_axis="mel")
mir_eval.display.events(ref_beats, color="w", alpha=0.8, linewidth=3)
mir_eval.display.events(est_beats, color="c", alpha=0.8, linewidth=3, linestyle="--")

Example: Labeled Intervals#

Example: Source Separation#

y_harm, y_perc = librosa.effects.hpss(y, margin=8)
plt.figure(figsize=(12, 4))
mir_eval.display.separation([y_perc, y_harm], sr, labels=["percussive", "harmonic"])
plt.legend()