InverseSpectrogram transform
🚀 Feature
I’d like to propose adding an InverseSpectrogram transform to torchaudio.
Motivation
istft is already present in torchaudio.functional, but users who want to build a cepstrum have to wrap the functional method in a torch module themselves.
Pitch
Add a wrapper for functional.istft, similar to how the Spectrogram transform wraps F.spectrogram.
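A minimal sketch of what such a wrapper could look like, assuming core PyTorch's torch.istft as the underlying call. The class name InverseSpectrogramSketch and its defaults are hypothetical and chosen for illustration; this is not the torchaudio API.

```python
import torch


class InverseSpectrogramSketch(torch.nn.Module):
    """Hypothetical module wrapper around torch.istft (illustration only)."""

    def __init__(self, n_fft=400, win_length=None, hop_length=None):
        super().__init__()
        self.n_fft = n_fft
        # Assumed defaults: window spans the full FFT size, hop is half the window.
        self.win_length = win_length if win_length is not None else n_fft
        self.hop_length = (
            hop_length if hop_length is not None else self.win_length // 2
        )
        # A buffer moves with the module when .to(device) is called.
        self.register_buffer("window", torch.hann_window(self.win_length))

    def forward(self, specgram, length=None):
        # specgram: complex tensor shaped (freq, frames) or (batch, freq, frames)
        return torch.istft(
            specgram,
            n_fft=self.n_fft,
            hop_length=self.hop_length,
            win_length=self.win_length,
            window=self.window,
            length=length,
        )


# Usage: build a complex spectrogram with torch.stft, then invert it.
waveform = torch.randn(1, 4000)
spec = torch.stft(
    waveform,
    n_fft=400,
    hop_length=200,
    window=torch.hann_window(400),
    return_complex=True,
)
transform = InverseSpectrogramSketch(n_fft=400)
recovered = transform(spec, length=waveform.shape[-1])
```

Keeping the window as a registered buffer is one possible design choice: it lets the transform be moved between devices together with the rest of a model, which a bare functional call does not do on its own.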
Additional context
This idea came up while I was looking for quick ways to implement various non-mel-scale cepstral coefficients in PyTorch (LFCC, CQCC, LPCC, BFCC, etc.). Implementing all the various filterbanks seemed like it would overload the library. Adding the transform would reduce the complexity needed to extract these cepstral coefficients.
Issue Analytics
- State:
- Created 3 years ago
- Comments: 6 (3 by maintainers)
Top GitHub Comments
You’re correct that they should be close. The question is how close. I’ll open a PR and we can see there how close they are.
Good point 😃 Yes, that sounds like a good addition.
For testing, how close are the original waveform and the reconstructed waveform? Since stft and istft are close, I would expect Spectrogram and InverseSpectrogram to be close. Is that correct? Otherwise, do you have a reference to match? I don’t see it in librosa.
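One way to put a number on "how close" is a round-trip check: run a waveform through the forward transform and back, then measure the largest sample-wise difference. The sketch below uses torch.stft and torch.istft directly; the helper name roundtrip_error, the random test signal, and the parameter values are assumptions for illustration. A Hann window with 50% overlap is used because it satisfies the overlap-add condition that istft needs for reconstruction.

```python
import torch


def roundtrip_error(waveform, n_fft=400, hop_length=200):
    """Return the max absolute difference after stft followed by istft.

    Hypothetical helper for illustration; not part of torch or torchaudio.
    """
    window = torch.hann_window(n_fft)
    spec = torch.stft(
        waveform,
        n_fft=n_fft,
        hop_length=hop_length,
        window=window,
        return_complex=True,
    )
    recovered = torch.istft(
        spec,
        n_fft=n_fft,
        hop_length=hop_length,
        window=window,
        length=waveform.shape[-1],  # trim or pad back to the input length
    )
    return (recovered - waveform).abs().max().item()


torch.manual_seed(0)
waveform = torch.randn(2, 8000)  # batch of two random test signals
err = roundtrip_error(waveform)
print(f"max abs reconstruction error: {err:.2e}")
```

A test could then compare the reported error against a tolerance; the right tolerance depends on dtype and signal scale, so any specific threshold is a judgment call.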