Tempogram ratio and f0 harmonic interpolation
Is your feature request related to a problem? Please describe.
This was alluded to in #1426, but it would be handy to finally provide an implementation of the tempogram ratio feature from (Peeters, 2005). It would look something like the following (bottom subplot):
- Peeters, Geoffroy. “Rhythm Classification Using Spectral Rhythm Patterns.” ISMIR. 2005.
Describe the solution you’d like
The basic idea is to take a tempogram, extract a (time-varying) tempo estimate (corresponding to quarter-notes), and then use harmonic interpolation to measure tempogram energy for each frame at all musically important durations. The benefit of this over a raw tempogram is that it could locally normalize for tempo variation.
The underlying algorithm is somewhat similar to our interp_harmonics function, except that we want to pull out a different subset of frequencies for each frame. I imagine the implementation would use a vectorized interpolator in a similar fashion to what we do for reassigned spectrogram harmonics, but of course the details will be slightly different.
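The per-frame idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not librosa's implementation: the function name `tempogram_ratio_sketch`, its arguments, and the example factor set are hypothetical, and the tempogram's bin frequencies are assumed to be given in BPM in increasing order. For clarity it loops over frames with `np.interp` instead of using a vectorized interpolator.

```python
import numpy as np


def tempogram_ratio_sketch(tempogram, bpm_axis, tempo, factors):
    """Hypothetical helper: sample a tempogram at multiples of a per-frame tempo.

    tempogram : array, shape (n_bins, n_frames)
    bpm_axis  : array, shape (n_bins,), increasing bin frequencies in BPM (assumed)
    tempo     : array, shape (n_frames,), quarter-note tempo estimate per frame
    factors   : array, shape (n_factors,), multiples of the tempo to measure
    """
    tempogram = np.asarray(tempogram, dtype=float)
    bpm_axis = np.asarray(bpm_axis, dtype=float)
    tempo = np.asarray(tempo, dtype=float)
    factors = np.asarray(factors, dtype=float)

    n_frames = tempogram.shape[1]
    out = np.zeros((len(factors), n_frames))
    for t in range(n_frames):
        # Each frame gets its own set of target frequencies.
        targets = factors * tempo[t]
        # Linear interpolation along the tempo axis; targets outside
        # the covered BPM range are reported as zero energy.
        out[:, t] = np.interp(targets, bpm_axis, tempogram[:, t],
                              left=0.0, right=0.0)
    return out


# Synthetic usage: 0.5 = half notes, 2 = eighth notes,
# 3 = eighth-note triplets, 4 = sixteenth notes (relative to quarter notes).
rng = np.random.default_rng(0)
example_bpm = np.linspace(30.0, 480.0, 46)
example_tempogram = rng.random((46, 10))
example_tempo = np.full(10, 120.0)
example_factors = np.array([0.5, 1.0, 2.0, 3.0, 4.0])
example_ratio = tempogram_ratio_sketch(
    example_tempogram, example_bpm, example_tempo, example_factors
)
```

Because the output rows are indexed by factor rather than by absolute BPM, two frames with different tempi but the same rhythmic pattern would produce similar columns, which is the local tempo normalization described above.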
We could also support having a single, global tempo (much simpler), as well as aggregation over frames.
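The global-tempo case is simpler because every frame is sampled at the same frequencies, so the linear interpolation weights can be computed once and applied to all frames with a matrix product. The sketch below assumes this setup; `global_tempogram_ratio_sketch` and its `aggregate` argument are hypothetical names, and the bin frequencies are again assumed to be in increasing BPM.

```python
import numpy as np


def global_tempogram_ratio_sketch(tempogram, bpm_axis, tempo, factors,
                                  aggregate=None):
    """Hypothetical helper: tempogram ratio for a single, global tempo.

    tempogram : array, shape (n_bins, n_frames)
    bpm_axis  : array, shape (n_bins,), increasing bin frequencies in BPM (assumed)
    tempo     : float, global quarter-note tempo in BPM
    factors   : array-like, multiples of the tempo to measure
    aggregate : optional callable such as np.mean, applied across frames
    """
    tempogram = np.asarray(tempogram, dtype=float)
    bpm_axis = np.asarray(bpm_axis, dtype=float)
    targets = np.asarray(factors, dtype=float) * float(tempo)

    # Interpolating each unit basis vector yields the weight that every
    # tempogram bin contributes to every target frequency.
    n_bins = len(bpm_axis)
    weights = np.stack(
        [np.interp(targets, bpm_axis, basis, left=0.0, right=0.0)
         for basis in np.eye(n_bins)],
        axis=1,
    )  # shape (n_factors, n_bins)

    ratio = weights @ tempogram  # shape (n_factors, n_frames)
    if aggregate is not None:
        # Collapse the time axis, e.g. with np.mean or np.median.
        ratio = aggregate(ratio, axis=-1)
    return ratio
```

Passing `aggregate=np.mean` would give one value per factor for the whole signal, which is one possible reading of "aggregation over frames".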
Top GitHub Comments
I am trying to understand how your idea differs from interp_harmonics; I am not familiar with that function. What I can say is that, in my case, I was trying to deconvolve a log-spectrum into an energy-normalized pitch component and a pitch-normalized energy component. Since the energy component is pitch-normalized, you don’t need to estimate the f0, and you can then find the energy of the harmonics easily (this works better in monophonic cases), hence the idea of using it to derive a simple timbre descriptor. Would you need to provide the f0 in your case, then?
Poring over some old discussions, I just realized that this proposed functionality could also be useful in some unexpected ways. If f0 is fixed to a tonal center frequency (over all time), and the “harmonics” are allowed to be fractions (intervals; there is no technical reason to forbid this), then we can also compute pitch salience histograms, as described here: https://github.com/librosa/librosa/issues/641#issuecomment-636593736
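A rough sketch of that idea, under assumptions: `interval_salience_sketch` is a hypothetical name, the input is a magnitude spectrogram with increasing bin frequencies in Hz, and "salience" is taken here to mean interpolated energy summed over time and normalized. The linked comment may define it differently.

```python
import numpy as np


def interval_salience_sketch(S, freqs, tonic_hz, ratios):
    """Hypothetical helper: energy at fixed interval ratios above a tonic.

    S        : array, shape (n_freqs, n_frames), magnitude spectrogram
    freqs    : array, shape (n_freqs,), increasing bin frequencies in Hz (assumed)
    tonic_hz : float, tonal center used as a fixed f0 for all frames
    ratios   : array-like, fractional "harmonics" (interval ratios)
    """
    S = np.asarray(S, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    targets = float(tonic_hz) * np.asarray(ratios, dtype=float)

    # Interpolate every frame at the same target frequencies.
    energy = np.stack(
        [np.interp(targets, freqs, frame, left=0.0, right=0.0)
         for frame in S.T],
        axis=1,
    )  # shape (n_ratios, n_frames)

    # Sum over time and normalize to obtain a histogram over intervals.
    hist = energy.sum(axis=1)
    total = hist.sum()
    return hist / total if total > 0 else hist


# Example interval sets: equal-tempered semitones within one octave,
# or just-intonation fractions of a major scale.
semitone_ratios = 2.0 ** (np.arange(12) / 12.0)
just_major_ratios = np.array([1, 9 / 8, 5 / 4, 4 / 3, 3 / 2, 5 / 3, 15 / 8])
```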