Incorrect size of mel spectrogram

Hi,

I compute the mel spectrogram on a time-domain signal that has 13230080 samples, like so:

mel_spectrogram = librosa.feature.melspectrogram(y=audio_data, sr=44100, n_fft=2048, hop_length=512)

The resulting shape of mel_spectrogram is (128, 25841). However, as far as I understand, it should be (128, 25840), i.e. (n_mels, length of time-domain signal / hop size). For some reason it has one extra frame.

Can you please explain? Thanks!

Issue Analytics

Created: 7 years ago
Comments: 11 (5 by maintainers)

Top GitHub Comments

2 reactions

bmcfee commented, Mar 17, 2017

"Why is it 1 + (n - n_fft) in the numerator?"

It’s (n - n_fft) in the numerator.

Working backwards, say the last frame has index T and frames are left-aligned, so frame T covers samples T * hop_length through T * hop_length + n_fft - 1. For that last frame to be contained in the signal, you need T * hop_length + n_fft <= n. Rearranging terms, you get T <= (n - n_fft) / hop_length.

The 1+ is there to count the zero-hop case (i.e. the first frame, at index 0).
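A minimal sketch of the frame-count arithmetic, applied to the numbers from the question. The helper name frame_count is hypothetical and not part of librosa. The centered branch assumes the STFT is run with center=True (the librosa default), which pads n_fft // 2 samples on each side of the signal before framing.

```python
def frame_count(n_samples: int, n_fft: int = 2048, hop_length: int = 512,
                center: bool = True) -> int:
    """Count the left-aligned frames of length n_fft that fit in the signal.

    Hypothetical helper for illustration only; not a librosa function.
    """
    if center:
        # Assumption: centered framing pads n_fft // 2 samples on each side.
        n_samples += 2 * (n_fft // 2)
    if n_samples < n_fft:
        # Not even one full frame fits.
        return 0
    # Largest frame index T satisfies T * hop_length + n_fft <= n_samples;
    # the leading 1 counts frame 0.
    return 1 + (n_samples - n_fft) // hop_length


n = 13230080  # signal length from the question

print(frame_count(n, center=True))   # 25841
print(frame_count(n, center=False))  # 25837
```

With centering, the padding adds exactly n_fft samples when n_fft is even, so the count reduces to 1 + n // hop_length. That gives 25841 here rather than the expected 25840.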

1 reaction

nevosegal commented, Mar 17, 2017

Right, I computed it incorrectly, got it now. Thanks again! 💯