[LIBREP] melodyplot: a waveplot in the piano-roll domain
What is a melodyplot?
Melodyne, a proprietary program made by Celemony GmbH, has a very intuitive interface for displaying melodies. It sits between a wave plot, a pitch contour plot, and a piano-roll representation. Here’s a video demonstrating the tool:
https://www.youtube.com/watch?v=2v52AeU59N0
Why is a melodyplot useful for MIR?
In short, the benefit of this visualization is that it superimposes three different time scales in the evolution of the melody: (1) the time scale of amplitude modulation, shown on the waveform plot; (2) the time scale of frequency modulation, shown on the pitch contour plot; and (3) the time scale of musical events, shown as blocks in the piano-roll domain.
One person who uses this tool for musicological research is @ethanhein at NYU.
How to use a melodyplot?
I propose adding a visualization like this to librosa, mixing (1) and (3). The pitch contour (2) can always be added on top by the user as a simple line plot. The function prototype would be:
melodyplot(y, melody, sr=22050, hop_length=512, max_points=50000.0, x_axis='time', offset=0.0, max_sr=1000, ax=None, **kwargs)
Note that this prototype is the same as that of waveplot, with the exception of melody and hop_length.
The positional argument melody would be a one-dimensional np.ndarray encoding the quantized pitch curve in Hertz. The responsibility for quantizing this pitch curve according to a given temperament would rest with the user.
Null values or np.nan values would encode unvoiced portions of the melody. This is the standard in mir_eval (https://github.com/craffel/mir_eval) and in the MedleyDB dataset.
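As an illustration of the quantization step that would be left to the user, here is a minimal sketch for twelve-tone equal temperament. The helper name quantize_to_12tet and the A4 = 440 Hz reference are assumptions made for this example, not part of the proposal; frames that are zero or NaN are treated as unvoiced.

```python
import numpy as np


def quantize_to_12tet(f0_hz, a4_hz=440.0):
    """Snap a pitch curve in Hz to the nearest 12-TET note.

    Hypothetical helper: zero, negative, or NaN frames are treated as
    unvoiced and returned as NaN.
    """
    f0_hz = np.asarray(f0_hz, dtype=float)
    quantized = np.full_like(f0_hz, np.nan)
    voiced = np.isfinite(f0_hz) & (f0_hz > 0)
    # Convert Hz to fractional MIDI note numbers, round, and convert back.
    midi = 69.0 + 12.0 * np.log2(f0_hz[voiced] / a4_hz)
    quantized[voiced] = a4_hz * 2.0 ** ((np.round(midi) - 69.0) / 12.0)
    return quantized


# A raw pitch curve with two unvoiced frames (NaN and 0.0).
f0 = np.array([438.0, 442.0, np.nan, 0.0, 262.0])
melody = quantize_to_12tet(f0)
```

Another temperament would only need a different rounding grid, which is why the proposal leaves this step outside the display function.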
The keyword argument hop_length would be an integer, just as in piptrack or stft. It would specify the sample rate of the melody signal as sr/hop_length.
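To make the relation between sr, hop_length, and the melody frames concrete, here is a small sketch using only NumPy. The variable names are illustrative, and it assumes that frame k starts at sample k * hop_length, with no centering offset.

```python
import numpy as np

sr = 22050        # audio sample rate in Hz
hop_length = 512  # audio samples between successive melody frames

# A toy quantized melody: two voiced frames, one unvoiced, one voiced.
melody = np.array([440.0, 440.0, np.nan, 493.88])

# Sample rate of the melody signal, in frames per second.
melody_rate = sr / hop_length

# Start of each melody frame, in samples and in seconds.
frame_starts = np.arange(len(melody)) * hop_length
frame_times = frame_starts / sr
```

With the default values, the melody signal runs at roughly 43 frames per second.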
Advanced use cases
The waveform input y could potentially be stereophonic. In this case, we would use the upper and lower parts of the waveform to show the left and right channels, just as in a waveplot.
Regarding waveform colors, I think that we should follow the default color cycle of Matplotlib 2.0, i.e. the Vega category 10 palette: first “#1f77b4” (blue), second “#ff7f0e” (orange), third “#2ca02c” (green), fourth “#d62728” (red), fifth “#9467bd” (purple), and so on. This would make it possible to display several voices on the same figure. Of course, this could always be changed manually, by way of a color kwarg that would be passed to fill_between.
Long-term vision: YIN-based melodyplot
~Long term (i.e. after #527 is merged), we could potentially have melody="yin", so that the pitch curve is estimated automatically 😃 Then, displaying the melody from the first ten seconds of a file would be as terse as:~
~melodyplot(load(filename, duration=10)[0])~
[EDIT: actually, the discussion below has shown that we should separate estimation (via YIN or otherwise) from display. But the possibility still holds. It would be three lines of code (load+YIN+melodyplot), not two.]
Calibrating the thickness of the waveform
From an algorithmic standpoint, one potential difficulty of melodyplot is that the amplitude of y needs to be rescaled so that its apparent vertical thickness is comparable to the pitch range of the melody itself. In the Melodyne software, that thickness is set to one semitone in twelve-tone equal temperament. This could be our default, but it would be nice to be able to cover other use cases: non-Western temperament (ref #641 !) as well as non-tempered melodies, such as bioacoustic sounds. There is potentially room for a keyword argument here. The name of this keyword argument, as well as its physical unit (cough #925 cough), would need some bikeshedding.
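One possible way to calibrate the thickness is sketched below, assuming a mono signal and a vertical axis in semitones (MIDI note numbers). The function name melody_band and the keyword thickness are placeholders invented for this example; the peak-normalized, frame-wise envelope is also just one choice among several.

```python
import numpy as np


def melody_band(y, melody_hz, hop_length=512, thickness=1.0):
    """Return (lower, upper) band edges in semitones, one pair per frame.

    Hypothetical sketch: `thickness` is the maximal height of the band in
    semitones, reached where the waveform envelope peaks.
    """
    y = np.asarray(y, dtype=float)
    melody_hz = np.asarray(melody_hz, dtype=float)
    n_frames = len(melody_hz)

    # Zero-pad or trim y so that it splits into n_frames blocks.
    padded = np.zeros(n_frames * hop_length)
    n = min(len(y), len(padded))
    padded[:n] = y[:n]

    # Peak amplitude per frame, normalized to a maximum of 1.
    envelope = np.abs(padded).reshape(n_frames, hop_length).max(axis=1)
    peak = envelope.max()
    if peak > 0:
        envelope = envelope / peak

    # Center of the band: the melody in MIDI units, NaN where unvoiced.
    voiced = np.isfinite(melody_hz) & (melody_hz > 0)
    center = np.full(n_frames, np.nan)
    center[voiced] = 69.0 + 12.0 * np.log2(melody_hz[voiced] / 440.0)

    half = 0.5 * thickness * envelope
    return center - half, center + half


# Three frames of decreasing amplitude; the last one is unvoiced.
y = np.concatenate([0.5 * np.ones(512), 0.25 * np.ones(512), 0.1 * np.ones(512)])
lower, upper = melody_band(y, [440.0, 440.0, np.nan])
```

The two arrays could then be passed to fill_between against the frame times. A stereo variant might use one channel for the upper edge and the other for the lower edge.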
Conclusion
librosa.display.melodyplot is a display function which shows a waveform y in the piano-roll domain, in the style of the Melodyne proprietary software, according to some pre-computed quantized pitch curve melody. I can only assume that this feature would be a huge improvement for doing computational musicology in Python.
Thank you for reading to the end. Please let me know what you think! Vincent.
Post-scriptum
LIBREP means “Librosa Enhancement Proposal”. Just an acronym I came up with.
Issue Analytics
- Created 4 years ago
- Comments:25 (25 by maintainers)
Top GitHub Comments
I think I won’t have time to put this in 0.8.0. Punting to 0.8.1 unless someone else wants to pick it up in the immediate future.
@bmcfee Here is the code: https://gist.github.com/lostanlen/6abbfdf505df1779e7b67940e0631057