Dithering constant

Why do torchaudio.compliance.kaldi.fbank and torchaudio.compliance.kaldi.spectrogram have such a large default dither parameter (=1.0)? Very often it simply turns the whole output into noise.

It’s common to use a dither close to 0, e.g. 0.00001 in QuartzNet and Jasper, which are near-SOTA ASR models (https://github.com/NVIDIA/NeMo/blob/master/examples/asr/configs/quartznet15x5.yaml).

Note that even the torchaudio tutorial uses dither = 0.0: https://pytorch.org/tutorials/beginner/audio_preprocessing_tutorial.html.

Also look at this issue and how it was resolved: https://github.com/pytorch/audio/issues/157

Issue Analytics

Created 4 years ago
Comments: 9 (9 by maintainers)

Top GitHub Comments

6 reactions

csukuangfj commented, May 8, 2020

Why do torchaudio.compliance.kaldi.fbank and torchaudio.compliance.kaldi.spectrogram have such a large default dither parameter (=1.0)?

Kaldi uses 1 as the default dither value. That is fine for Kaldi because waveforms in Kaldi have the range [-32768, 32767], so 1 is small compared to the maximum value 32767.

However, in torchaudio,

torchaudio.load(filename)

returns a tensor with values in the range [-1, 1]. So if you still use the default value 1 from Kaldi, you will distort the audio signal.
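To put rough numbers on the point above, here is a minimal sketch in plain Python (no torchaudio needed). It assumes dither is applied as zero-mean Gaussian noise multiplied by the dither value, and it uses a full-scale 440 Hz sine as a stand-in for real audio. The function name snr_db and all of its parameters are hypothetical, chosen only for this illustration.

```python
import math
import random


def snr_db(amplitude, dither, n=16000, freq=440.0, sample_rate=16000, seed=0):
    """Signal-to-noise ratio (dB) of a sine wave versus Gaussian dither.

    Assumption: dither adds `dither * N(0, 1)` noise to every sample.
    """
    rng = random.Random(seed)
    signal = [amplitude * math.sin(2 * math.pi * freq * t / sample_rate)
              for t in range(n)]
    noise = [dither * rng.gauss(0.0, 1.0) for _ in range(n)]
    signal_power = sum(s * s for s in signal) / n
    noise_power = sum(x * x for x in noise) / n
    return 10 * math.log10(signal_power / noise_power)


# Kaldi-style waveform in [-32768, 32767] with dither = 1
kaldi_snr = snr_db(amplitude=32767.0, dither=1.0)

# torchaudio-style waveform in [-1, 1] with the same dither = 1
torch_snr = snr_db(amplitude=1.0, dither=1.0)

# [-1, 1] waveform with the dither rescaled by 1 / 32768
rescaled_snr = snr_db(amplitude=1.0, dither=1.0 / 32768.0)

print(f"int16 range, dither=1:         {kaldi_snr:.1f} dB")
print(f"[-1, 1] range, dither=1:       {torch_snr:.1f} dB")
print(f"[-1, 1] range, dither=1/32768: {rescaled_snr:.1f} dB")
```

With these assumptions the int16-range case comes out with the signal far above the noise, while the [-1, 1] case has a negative SNR, meaning the noise carries more power than the signal. Rescaling the dither by 1/32768 (or, equivalently, multiplying the waveform by 32768 before feature extraction) restores the Kaldi-like ratio. Passing a small value such as dither=0.0 explicitly to the torchaudio functions avoids the problem as well.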

1 reaction

popcornell commented, Feb 29, 2020

I second this: the current default makes the torchaudio.compliance.kaldi features totally unusable out of the box. I spent an hour looking for possible bugs in my labels, only to find out that my model was basically being fed noise because of the default dither value.
