icqt outputs noise with unusual format

Description

icqt (inverse CQT) returns loud noise in an unusual format when I simply convert “.wav => CQT => .wav”.
I cannot open the output file with Windows 10’s Groove Music app (the iSTFT result opens fine), so I tried opening it with Audacity.
Audacity can open the icqt output .wav file, but the result is loud noise.

Steps/Code to Reproduce

Even the example from the official documentation returns unintended results.

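import librosa

# Load the first 15 seconds of librosa's bundled example track
y, sr = librosa.load(librosa.util.example_audio_file(), duration=15)
# Forward CQT followed by inverse CQT, both with default parameters
C = librosa.cqt(y, sr=sr)
y_hat = librosa.icqt(C, sr=sr)

# Write the original and the reconstruction for comparison
librosa.output.write_wav("./input.wav", y, sr, norm=True)
librosa.output.write_wav("./reconstructed.wav", y_hat, sr, norm=True)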

This is basically the same as the official sample code.

Expected Results

The code above is simply a “wave => CQT => wave” conversion, so I expected the output to be almost the same as the input .wav file.
The sound might be degraded slightly, but it should not be noise.

Actual Results

Loud noise, with no musical content.
At the same time, a SciPy warning is displayed:

FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`.    
In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.

Versions

Windows-10-10.0.17134-SP0
Python 3.7.1 (v3.7.1:260ec2c36a, Oct 20 2018, 14:57:15) [MSC v.1915 64 bit (AMD64)]
NumPy 1.15.4
SciPy 1.1.0
librosa 0.6.2

Issue Analytics

Created 5 years ago
Comments: 5 (3 by maintainers)

Top GitHub Comments

bmcfee commented, Mar 19, 2019

One more thing you might try: in icqt, set amin=1e-2 (rather than the default of 1e-6). This controls the silence threshold in the inverse window normalization, which I suspect is set to be far too low by default. Bringing it up to 1e-2 gets the default analysis parameters in the ballpark of something listenable (though it’s still heavily distorted, and the above suggestions will improve things).

It’s possible that we could do something much smarter in setting the window inversion threshold, and that could improve things across the board. I’ll stick this on the docket for 0.7.
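
To get a feel for why the threshold matters, here is a minimal pure-Python sketch of window normalization with a short filter and a long hop. It is not librosa's implementation: it assumes the inverse divides by the overlap-added squared window only where that sum exceeds amin, and it uses an illustrative window length of 94 samples with the default hop of 512. The helper names are hypothetical.

```python
import math

def hann_squared(n):
    # Squared periodic Hann window of length n.
    return [(0.5 - 0.5 * math.cos(2.0 * math.pi * i / n)) ** 2 for i in range(n)]

def window_sumsquare(win_length, hop_length, n_frames):
    # Overlap-add the squared window at each frame position.
    total = [0.0] * (hop_length * (n_frames - 1) + win_length)
    w2 = hann_squared(win_length)
    for frame in range(n_frames):
        start = frame * hop_length
        for i, value in enumerate(w2):
            total[start + i] += value
    return total

def max_gain(wss, amin):
    # Largest amplification if we divide by wss only where wss > amin
    # (assumed normalization rule, for illustration only).
    kept = [v for v in wss if v > amin]
    return 1.0 / min(kept) if kept else 1.0

# Illustrative values: a 94-sample filter with a 512-sample hop.
wss = window_sumsquare(win_length=94, hop_length=512, n_frames=4)
uncovered = sum(1 for v in wss if v == 0.0) / len(wss)
print(f"fraction of samples with no window coverage: {uncovered:.2f}")
for amin in (1e-6, 1e-2):
    print(f"amin={amin:g}: worst-case gain ~{max_gain(wss, amin):.3g}")
```

Under these assumptions, a very low threshold lets samples near the window edges be scaled up by several orders of magnitude, while a threshold of 1e-2 caps the gain below 100. That is consistent with the buzz becoming merely heavy distortion, but it does nothing for the samples that no window covers at all.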

bmcfee commented, Mar 19, 2019

I looked into this today, and what’s happening is that the default CQT analysis parameters are just not invertible. The problem ultimately boils down to the fact that the filters are too short relative to the default hop length, which produces (time) gaps in the analysis at higher frequency bands. These gaps become numerically unstable when it comes time to invert the CQT, hence the noise / buzz. I don’t think this is avoidable with the current default parameters, but we can certainly make the icqt method a bit smarter at alerting the user when the parameters are not going to produce a faithful inversion.

That said, here are a few potential workarounds for you:

Use a shorter hop length. Instead of 512, try going down to 256 or 128. This will ensure proper coverage between frames.
Use more frequency bands per octave. By default, we use 12, but if you go up to 36 (3 bins per semitone), the filters necessarily become longer (in time), which again produces better time coverage and inversion. If you do this, remember to increase n_bins as well. With default sampling rate and hop length, I get decent reconstruction with 72 bins per octave. If you drop the hop length to 256, you should get decent reconstruction at 36 bins per octave.
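
The relationship between filter length and hop length described above can be estimated with a short calculation. This is a rough sketch, not librosa's own check: it assumes the standard constant-Q length formula (Q * sr / f, with Q = filter_scale / (2^(1/bins_per_octave) - 1)), a 22050 Hz sampling rate, fmin at C1 (about 32.70 Hz), and 7 octaves of bins. The function name is hypothetical.

```python
import math

def shortest_filter_length(bins_per_octave, n_bins, sr=22050, fmin=32.70,
                           filter_scale=1.0):
    """Approximate length (in samples) of the highest-frequency CQT filter."""
    # Constant-Q factor: ratio of center frequency to bandwidth.
    q = filter_scale / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    # Center frequency of the top bin.
    f_top = fmin * 2.0 ** ((n_bins - 1) / bins_per_octave)
    return q * sr / f_top

# (bins_per_octave, n_bins, hop_length): defaults, then the two workarounds.
for bpo, n_bins, hop in [(12, 84, 512), (36, 252, 256), (72, 504, 512)]:
    length = shortest_filter_length(bpo, n_bins)
    status = "covers the hop" if length >= hop else "leaves gaps"
    print(f"bins_per_octave={bpo:3d} hop={hop:3d} "
          f"shortest filter ~{length:6.1f} samples: {status}")
```

Under these assumptions the shortest default filter is roughly 94 samples, far shorter than a 512-sample hop, while 36 bins per octave gives about 276 samples (longer than a 256 hop) and 72 bins per octave gives about 550 samples (longer than a 512 hop). Treat the comparison as a rough indicator only, since the effective support of a tapered window is narrower than its nominal length. To apply either workaround, the same hop_length and bins_per_octave would need to be passed to both librosa.cqt and librosa.icqt, with the larger n_bins passed to librosa.cqt.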
