
`librosa.effects.trim` bug after 0.9 release



Describe the bug librosa.effects.trim seems to work incorrectly with some parameter sets. I’m running the test on this file: https://github.com/NVIDIA/DALI_extra/blob/main/db/audio/wav/237-134500-0024.wav (it’s part of the LibriSpeech dataset). I’m attaching the repro below. Looking at the sample in Audacity, the 0.8.1 result looks better: the leading silence ends around 0.3 s, which with sample_rate=16000 is near the 5483rd sample, whereas the 0.9 version starts the trimmed signal at the 235th sample.
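To make the sample-to-time arithmetic above easy to check, here is a minimal sketch that converts the two reported start indices to seconds. The helper name `samples_to_seconds` is hypothetical; the sample rate and indices are the ones quoted in this report.

```python
SAMPLE_RATE = 16000  # Hz, as stated in the report


def samples_to_seconds(index, sample_rate=SAMPLE_RATE):
    """Convert a sample index to a time offset in seconds."""
    return index / sample_rate


# Start indices reported for the two librosa versions.
start_081 = samples_to_seconds(5483)  # roughly 0.343 s
start_09 = samples_to_seconds(235)    # roughly 0.015 s

print(f"0.8.1 start: {start_081:.3f} s, 0.9 start: {start_09:.3f} s")
```

The 0.8.1 start lands close to the 0.3 s mark seen in Audacity, while the 0.9 start is almost at the beginning of the file.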

[Image: screenshot attached to the original issue]

To Reproduce With librosa 0.8.1 (I believe this is the correct result):

In [1]: import soundfile as sf
   ...: import librosa
   ...:
   ...: data, samplerate = sf.read('237-134500-0024.wav')

In [2]: yt,index=librosa.effects.trim(y=data, top_db=-10, ref=.0003, frame_length=512, hop_length=1)

In [3]: yt
Out[3]:
array([ 0.02252197,  0.01318359,  0.00125122, ..., -0.03158569,
       -0.01086426,  0.012146  ])

In [4]: yt.shape
Out[4]: (53132,)

In [5]: index
Out[5]: array([ 5483, 58615])

With librosa 0.9:

In [1]: import soundfile as sf
   ...: import librosa
   ...:
   ...: data, samplerate = sf.read('237-134500-0024.wav')

In [14]: yt, index = librosa.effects.trim(y=data, top_db=-10, ref=.0003, frame_length=512, hop_length=1)

In [15]: yt
Out[15]:
array([-0.00054932, -0.00027466, -0.00085449, ..., -0.00112915,
       -0.00222778, -0.00192261])

In [16]: yt.shape
Out[16]: (70169,)

In [17]: index
Out[17]: array([  235, 70404])

Expected behavior I believe the 0.8.1 behaviour is correct, while the 0.9 behaviour is not.
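As a quick sanity check on the two sessions above, the following sketch confirms that each reported `index` pair matches the reported `yt.shape`, and estimates how much more audio 0.9 keeps. The helper `trimmed_length` is a hypothetical name; all numbers are copied from the outputs in this report.

```python
SAMPLE_RATE = 16000  # Hz, as stated in the report

# (start, end) indices copied from the two sessions above.
index_081 = (5483, 58615)
index_09 = (235, 70404)


def trimmed_length(index):
    """Number of samples kept by a (start, end) trim interval."""
    start, end = index
    return end - start


len_081 = trimmed_length(index_081)  # matches yt.shape under 0.8.1
len_09 = trimmed_length(index_09)    # matches yt.shape under 0.9

# Extra audio that 0.9 keeps compared with 0.8.1.
extra_samples = len_09 - len_081
extra_seconds = extra_samples / SAMPLE_RATE

print(len_081, len_09, extra_samples, round(extra_seconds, 3))
```

With these numbers, 0.9 retains about one second more audio than 0.8.1 for the same arguments.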

Screenshots Attached

Software versions Regression between librosa 0.8.1 and librosa 0.9.

Linux-5.4.0-92-generic-x86_64-with-glibc2.29
Python 3.8.10 (default, Nov 26 2021, 20:14:08)
[GCC 9.3.0]
NumPy 1.21.5
SciPy 1.8.0
librosa 0.8.1
INSTALLED VERSIONS
------------------
python: 3.8.10 (default, Nov 26 2021, 20:14:08)
[GCC 9.3.0]

librosa: 0.8.1

audioread: 2.1.9
numpy: 1.21.5
scipy: 1.8.0
sklearn: 1.0.2
joblib: 1.1.0
decorator: 5.1.1
soundfile: 0.10.3
resampy: 0.2.2
numba: 0.55.1

numpydoc: None
sphinx: 3.4.3
sphinx_rtd_theme: 0.5.1
sphinxcontrib.versioning: None
sphinx-gallery: None
pytest: None
pytest-mpl: None
pytest-cov: None
matplotlib: 3.3.3
presets: None


Issue Analytics

State:
Created 2 years ago
Comments: 7 (4 by maintainers)

Top GitHub Comments

bmcfee commented, Feb 9, 2022

> I just wanted to highlight that even projects run by small teams with limited resources can be widely adopted and used as a reference.

Thanks, and I do appreciate that!

It might be worth adding a bit here to explain why this change didn’t go through a deprecation cycle, just for future reference.

First, we make no claims to semantic versioning compliance, though we do aim for it whenever possible. That said, we are still in the 0.x stage (I know, it’s been 9 years…) and reserve the right to violate backward compatibility when necessary. But we do try to be good citizens about it and use deprecation cycles. We’re planning to move to 1.0 next year, at which point things will change and we’ll be more rigid about breaking changes. For now though, I’m trying to push in as many breaking changes as I can before it becomes more difficult to do so.

Now, in this case, the ref parameter is overloaded in a variety of ways, and the behavior tracks its use in the *_to_db functions. Note specifically that it is not called ref_power - maybe it should have been, but it wasn’t. Had it been called ref_power, then we would have done a deprecation rename and changed behavior in the 0.10 release. But, since the units and interpretation are not implied by the name, and there’s otherwise no way to distinguish a power number from an amplitude number, there’s not a clear solution here.

For the numerical stability reason given above, I think it’s unambiguously better to use amplitude for this threshold than power, so it’s a change that we should have made anyway. Had I remembered to update the docstring, this would be less of an issue. Yes, code would still have broken, but it would have been easier to see why.
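The power-versus-amplitude distinction can be made concrete with a small NumPy-only sketch. It assumes, following the explanation above, that 0.8.1 compared the squared frame RMS against `ref` on a power scale (10·log10) while 0.9 compares the RMS itself against `ref` on an amplitude scale (20·log10). The helper names and the sample RMS values are hypothetical, and librosa's internal clamping of very small values is omitted.

```python
import numpy as np


def power_db(rms, ref_power):
    """dB of the frame's mean-square energy relative to a power reference."""
    return 10.0 * np.log10(rms ** 2 / ref_power)


def amplitude_db(rms, ref_amplitude):
    """dB of the frame's RMS relative to an amplitude reference."""
    return 20.0 * np.log10(rms / ref_amplitude)


# Hypothetical per-frame RMS values, from quiet to loud.
rms = np.array([0.001, 0.01, 0.02, 0.1])

ref = 0.0003   # value passed in the report
top_db = -10   # value passed in the report

# A frame counts as non-silent when its level exceeds -top_db.
old_mask = power_db(rms, ref) > -top_db          # assumed 0.8.1 reading of ref
new_mask = amplitude_db(rms, ref) > -top_db      # assumed 0.9 reading of ref

# Reading the same threshold as an amplitude means taking its square root.
compat_mask = amplitude_db(rms, np.sqrt(ref)) > -top_db

print(old_mask, new_mask, compat_mask)
```

Under these assumptions the same `ref` marks far more frames as non-silent on the amplitude scale, which is consistent with 0.9 trimming much less of the file. The two scales agree when the amplitude reference is the square root of the power reference, so on 0.9 a call with `ref=np.sqrt(.0003)` might reproduce the 0.8.1 result; that has not been verified against the file in this report.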

JanuszL commented, Feb 9, 2022

@bmcfee,

Please don’t get me wrong. I really appreciate your work and the time you dedicate to developing this great library. Indeed, we found this problem during our internal tests, and I’m all for testing any external dependency a project may pull in, especially if it is used in a bigger product. I just wanted to highlight that even projects run by small teams with limited resources can be widely adopted and used as a reference.
