CQT2010 problematic output
It seems I messed something up when updating nnAudio from 0.1.15 to 0.2.0. The output of CQT2010 is very different from that of CQT2010v2. I suspect something goes wrong during downsampling, but I don't have time to debug at the moment, so I am posting this as an issue to remind myself later. If anyone knows the solution to this problem, a pull request is welcome.
The following code reproduces the issue. It was run with nnAudio 0.2.2.
import numpy as np
import torch
import librosa  # used for the reference CQT (X3)
import matplotlib.pyplot as plt  # used for the side-by-side plots
from scipy.signal import chirp
from nnAudio import Spectrogram
# Linear sweep case
fs = 44100
t = 1
f0 = 55
f1 = 22050
s = np.linspace(0, t, fs*t)
x = chirp(s, f0, 1, f1, method='linear')
x = x.astype(dtype=np.float32)
device='cpu'
n_bins = 100
bins_per_octave=12
window = 'hann'
filter_scale = 2
# window='hann'
normalization_type = 'wrap'
# CQT2010v2 (magnitude output, log-compressed)
stft2 = Spectrogram.CQT2010v2(sr=fs, fmin=f0, filter_scale=filter_scale,
                              n_bins=n_bins, bins_per_octave=bins_per_octave, window=window)
X2 = stft2(torch.tensor(x, device=device).unsqueeze(0), normalization_type=normalization_type)
X2 = torch.log(X2 + 1e-2)
# np.save("tests/ground-truths/linear-sweep-cqt-2010-mag-ground-truth", X.cpu())
# librosa reference (not plotted below)
X3 = librosa.cqt(x, sr=fs, fmin=f0, filter_scale=filter_scale,
                 n_bins=n_bins, bins_per_octave=bins_per_octave, window=window)
X3 = np.log(abs(X3) + 1e-2)
# CQT2010 (the output that looks wrong)
stft1 = Spectrogram.CQT2010(sr=fs, fmin=f0, filter_scale=filter_scale,
                            n_bins=n_bins, bins_per_octave=bins_per_octave, window=window, pad_mode='constant')
X1 = stft1(torch.tensor(x, device=device).unsqueeze(0), normalization_type=normalization_type)
X1 = torch.log(X1 + 1e-2)
# Plot CQT2010 and CQT2010v2 side by side
fig, axes = plt.subplots(1, 2, figsize=(12, 4), dpi=200)
axes[0].imshow(X1[0,:,:], aspect='auto', origin='lower')
axes[0].set_title('CQT2010')
axes[1].imshow(X2[0,:,:], aspect='auto', origin='lower')
axes[1].set_title('CQT2010v2')
plt.show()
# axes[1,0].imshow(X3[:,:], aspect='auto', origin='lower')
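Besides comparing the two plots by eye, the mismatch can be put into numbers. The following is only a minimal sketch: `compare_spectrograms` is a hypothetical helper (not part of nnAudio), and it assumes both inputs are 2-D arrays laid out as (frequency bins, time frames), cropping them to their common shape before comparing. The per-bin average may hint at whether the gap is concentrated in particular frequency ranges, which could help narrow down the suspected downsampling problem.

```python
import numpy as np

def compare_spectrograms(a, b):
    """Hypothetical helper: summarize the gap between two (freq, time) arrays.

    The arrays are cropped to their common shape first, since the two
    transforms are not guaranteed to return the same number of frames.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    n_freq = min(a.shape[0], b.shape[0])
    n_time = min(a.shape[1], b.shape[1])
    diff = np.abs(a[:n_freq, :n_time] - b[:n_freq, :n_time])
    return {
        "shape_a": a.shape,
        "shape_b": b.shape,
        "mean_abs_diff": float(diff.mean()),
        "max_abs_diff": float(diff.max()),
        # One value per frequency bin, averaged over time
        "per_bin_mean": diff.mean(axis=1),
    }

# Synthetic stand-ins for two log-magnitude spectrograms (not real nnAudio output)
rng = np.random.default_rng(0)
ref = rng.random((100, 87))
other = ref[:, :86] + 0.5  # one frame shorter, constant offset
report = compare_spectrograms(ref, other)
print(report["shape_a"], report["shape_b"])
print(report["mean_abs_diff"], report["max_abs_diff"])
```

With the reproduction script above, a call such as `compare_spectrograms(X1[0].numpy(), X2[0].numpy())` should work, assuming both tensors are on the CPU.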
Issue Analytics
- State:
- Created 3 years ago
- Comments:13 (13 by maintainers)
Definitely, a pull request is welcome. I will try it on my GPU later.
Sure! Number 1 is already done. Let me experiment with how we can do number 2 without breaking the things that are working now.