[LIBREP] melodyplot: a waveplot in the piano-roll domain

What is a melodyplot?

Melodyne, a proprietary program made by Celemony GmbH, has a very intuitive interface for displaying melodies. It sits right in between a waveform plot, a pitch contour plot, and a piano-roll representation. Here’s a video demonstrating the tool:

https://www.youtube.com/watch?v=2v52AeU59N0

 

Why is a melodyplot useful for MIR?

In short, the benefit of this visualization is that it superimposes three different time scales in the evolution of the melody: (1) the time scale of amplitude modulation, shown by the waveform plot; (2) the time scale of frequency modulation, shown by the pitch contour plot; and (3) the time scale of musical events, shown as blocks in the piano-roll domain.

One person who uses this tool for musicological research is @ethanhein at NYU.

 

How to use a melodyplot?

I am proposing that we add a visualization like this to librosa, mixing (1) and (3). The pitch contour (2) can always be added on top by the user, as a simple line plot. The function signature would be:

melodyplot(y, melody, sr=22050, hop_length=512, max_points=50000.0, x_axis='time', offset=0.0, max_sr=1000, ax=None, **kwargs)

Note that this prototype is the same as waveplot, with the exception of melody and hop_length.

The positional argument melody would be a one-dimensional np.ndarray encoding the quantized pitch curve in Hertz. The responsibility for quantizing this pitch curve according to a given temperament would rest with the user.

Null values or np.nan values would encode unvoiced portions of the melody. This is the standard in mir_eval (https://github.com/craffel/mir_eval) and in the MedleyDB dataset.
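
As an illustration of the quantization step that is left to the user, here is a minimal sketch, assuming twelve-tone equal temperament with A4 = 440 Hz. The helper name quantize_to_12tet and its a4_hz parameter are hypothetical and not part of librosa. Zero and np.nan inputs are both treated as unvoiced.

```python
import numpy as np

def quantize_to_12tet(f0_hz, a4_hz=440.0):
    """Snap a pitch curve in Hz to the nearest 12-TET pitch; unvoiced frames become NaN.

    Hypothetical helper for illustration only, not a librosa function.
    """
    f0_hz = np.asarray(f0_hz, dtype=float)
    melody = np.full_like(f0_hz, np.nan)
    # Frames that are NaN, zero, or negative are considered unvoiced
    voiced = np.isfinite(f0_hz) & (f0_hz > 0)
    # Distance from A4 in semitones, rounded to the nearest integer
    semitones = np.round(12.0 * np.log2(f0_hz[voiced] / a4_hz))
    melody[voiced] = a4_hz * 2.0 ** (semitones / 12.0)
    return melody

# A raw pitch track with unvoiced frames encoded as NaN or 0
f0 = np.array([np.nan, 438.0, 445.0, 262.0, 0.0, np.nan])
melody = quantize_to_12tet(f0)
```

Other temperaments would need a different rounding grid, which is one reason to keep this step outside the display function.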

The keyword argument hop_length would be an integer, just as in piptrack or stft. It would specify the sample rate of the melody signal as sr/hop_length.
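
To make the role of hop_length concrete, the sketch below converts a frame-wise melody into piano-roll blocks with start and end times in seconds, using the frame rate sr/hop_length. The function name melody_to_blocks and the tuple layout are assumptions made for illustration; the proposal itself does not specify how blocks would be computed.

```python
import numpy as np

def melody_to_blocks(melody, sr=22050, hop_length=512, offset=0.0):
    """Group consecutive frames of equal pitch into (start_s, end_s, pitch_hz) blocks.

    Hypothetical helper for illustration only, not a librosa function.
    """
    melody = np.asarray(melody, dtype=float)
    frame_rate = sr / hop_length  # melody frames per second
    blocks = []
    start = 0
    for i in range(1, len(melody) + 1):
        # The run continues while the pitch is unchanged (or stays unvoiced)
        same = i < len(melody) and (
            melody[i] == melody[start]
            or (np.isnan(melody[i]) and np.isnan(melody[start]))
        )
        if not same:
            # Only voiced runs become blocks; NaN and zero runs are skipped
            if np.isfinite(melody[start]) and melody[start] > 0:
                blocks.append((offset + start / frame_rate,
                               offset + i / frame_rate,
                               float(melody[start])))
            start = i
    return blocks

# 10 melody frames per second: sr=1000, hop_length=100
example = np.array([np.nan, 440.0, 440.0, 440.0, np.nan, 330.0, 330.0])
blocks = melody_to_blocks(example, sr=1000, hop_length=100)
```

Each block could then be drawn as one piano-roll event, with the waveform envelope filling it.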

 

Advanced use cases

The waveform input y could potentially be stereophonic. In this case, we would use the upper and lower parts of the waveform to show the left and right channels, just as in a waveplot.

Regarding waveform colors, I think that we should follow the default color cycle of Matplotlib 2.0, i.e. the Vega category 10 palette: first “#1f77b4” (blue), second “#ff7f0e” (orange), third “#2ca02c” (green), fourth “#d62728” (red), fifth “#9467bd” (purple), and so on. This would make it possible to display several voices on the same figure. Of course, this could always be changed manually, by way of a color kwarg that would be passed to fill_between.

 

Long-term vision: YIN-based melodyplot

~Long term (i.e. after #527 is merged), we could potentially have melody="yin", so that the pitch curve is estimated automatically 😃 Then, displaying the melody from the first ten seconds of a file would be as terse as:~

~melodyplot(load(filename, duration=10)[0])~

[EDIT: actually, the discussion below has shown that we should separate estimation (via YIN or otherwise) from display. But the possibility still holds. It would be three lines of code (load+YIN+melodyplot), not two.]

 

Calibrating the thickness of the waveform

From an algorithmic standpoint, one potential difficulty of melodyplot is that the amplitude of y needs to be rescaled so that its apparent vertical thickness is comparable to the pitch range of the melody itself. In the Melodyne software, that thickness is set to one semitone in twelve-tone equal temperament. This could be our default, but it would be nice to be able to cover other use cases: non-Western temperament (ref #641 !) as well as non-tempered melodies, such as bioacoustic sounds. There is potentially room for a keyword argument here. The name of this keyword argument, as well as its physical unit (cough #925 cough), would need some bikeshedding.
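
One possible way to do this rescaling is sketched below, assuming the thickness is expressed in semitones and applied symmetrically around the pitch on a log-frequency scale. The function name melody_band and the keyword thickness are placeholders; as noted above, the real name and unit are still open. The envelope here is the peak absolute amplitude per hop, which is an assumption rather than a settled design.

```python
import numpy as np

def melody_band(y, melody, hop_length=512, thickness=1.0):
    """Return (lower_hz, upper_hz) per melody frame.

    thickness is the full band height, in semitones, at peak amplitude.
    Hypothetical helper for illustration only, not a librosa function.
    """
    y = np.asarray(y, dtype=float)
    melody = np.asarray(melody, dtype=float)
    n_frames = len(melody)
    # Peak absolute amplitude of y within each hop (zero-padded or truncated to fit)
    padded = np.zeros(n_frames * hop_length)
    n = min(len(y), len(padded))
    padded[:n] = np.abs(y[:n])
    envelope = padded.reshape(n_frames, hop_length).max(axis=1)
    peak = envelope.max() if n_frames else 0.0
    if peak > 0:
        envelope = envelope / peak  # normalize to [0, 1]
    # Half of the band goes above the pitch and half below, in semitones
    half = 0.5 * thickness * envelope / 12.0
    return melody * 2.0 ** (-half), melody * 2.0 ** half

# Four frames of four samples each: loud, half as loud, unvoiced, silent
y = np.array([1, -1, 1, -1, 0.5, -0.5, 0.5, -0.5, 0.2, -0.2, 0.2, -0.2, 0, 0, 0, 0])
melody = np.array([440.0, 440.0, np.nan, 220.0])
lower, upper = melody_band(y, melody, hop_length=4)
```

With the default thickness, the loudest frame spans exactly one semitone (an upper/lower ratio of 2 ** (1 / 12)), and unvoiced frames stay np.nan in both bounds. The two arrays are the kind of input that fill_between expects.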

 

Conclusion

librosa.display.melodyplot is a display function which shows a waveform y in the piano-roll domain, in the style of the Melodyne proprietary software, according to some pre-computed quantized pitch curve melody. I can only assume that this feature would be a huge improvement for doing computational musicology in Python.

Thank you for reading to the end. Please let me know what you think! Vincent.

 

Post-scriptum

LIBREP means “Librosa Enhancement Proposal”. Just an acronym I came up with.

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 25 (25 by maintainers)

Top GitHub Comments

bmcfee commented, Jul 8, 2020 (1 reaction)

I think I won’t have time to put this in 0.8.0. Punting to 0.8.1 unless someone else wants to pick it up in the immediate future.
