Mel-Spectrogram Image Generation Issues
Hi, I’m trying to troubleshoot an error with my mel-spectrogram image generation. The saved .png image looks like the following:
Here is the code I am using:
var sg = new SpectrogramGenerator(sampleRate, FftSize, StepSize, 0, MaxFreq);
sg.Add(audio);
var bitmap = sg.GetBitmapMel(MelBinCount, Intensity, SaveAsDb);
bitmap.Save(file + "MelSpec.png", ImageFormat.Png);
where:
FftSize = 2048
StepSize = 300
MaxFreq = 3000
Intensity = 5
MelBinCount = 200
SaveAsDb = false
Also, I am using the ReadWAV method as shown in the README to read .wav audio inputs.
Any help would be appreciated.
You were totally right @swharden! I think another problem was that the buffer array was in bytes rather than floats.
However, now the mel-spectrograms seem to be missing the bass component. Here is a sample image:
Also, here is the new code for audio reads:
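The code originally posted with this comment is not reproduced here. For readers hitting the same bytes-versus-floats problem, the following is only a minimal sketch of one way to turn a raw PCM byte buffer into floating-point samples. It assumes 16-bit signed little-endian mono PCM, and `PcmConvert` and `Pcm16ToDoubles` are hypothetical names, not part of the Spectrogram library or of the poster's code.

```csharp
using System;

public static class PcmConvert
{
    // Hypothetical helper, not part of the Spectrogram library.
    // Assumption: raw holds 16-bit signed little-endian PCM from a single channel.
    public static double[] Pcm16ToDoubles(byte[] raw)
    {
        int sampleCount = raw.Length / 2; // two bytes per sample; a trailing odd byte is ignored
        var samples = new double[sampleCount];
        for (int i = 0; i < sampleCount; i++)
        {
            // Combine the low and high byte into one signed 16-bit value.
            short value = unchecked((short)(raw[2 * i] | (raw[2 * i + 1] << 8)));
            // Scale to the range [-1, 1).
            samples[i] = value / 32768.0;
        }
        return samples;
    }
}
```

Passing the raw bytes straight through would treat each byte as its own sample, which doubles the apparent sample count and scrambles the waveform. Whether the samples should be normalized as above or left at their 16-bit magnitude depends on the intensity settings used downstream.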
Never mind, I think the mel bin count I was using was too high. It’s all iterative adjustment from here on out, thanks again for the help!
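One possible way to see why a high mel bin count can leave the bass region empty is to compare the width of the lowest mel band with the width of one FFT bin. The sketch below is a rough check, not how the Spectrogram library computes its mel bands. It assumes the common mel formula mel = 2595 · log10(1 + f / 700) and bands spaced evenly on the mel scale from 0 Hz to MaxFreq. The sample rate is not given in this thread, so 44100 Hz in the usage note is an assumption, and all names in `MelBinCheck` are hypothetical.

```csharp
using System;

public static class MelBinCheck
{
    // Common mel-scale conversion (assumed; the library may use a different variant).
    public static double HzToMel(double hz) => 2595.0 * Math.Log10(1.0 + hz / 700.0);

    public static double MelToHz(double mel) => 700.0 * (Math.Pow(10.0, mel / 2595.0) - 1.0);

    // Width in Hz of one linear FFT bin.
    public static double FftBinWidthHz(int sampleRate, int fftSize) => (double)sampleRate / fftSize;

    // Width in Hz of the lowest band when 0..maxFreq is split into melBinCount equal mel steps.
    public static double LowestMelBandWidthHz(double maxFreq, int melBinCount) =>
        MelToHz(HzToMel(maxFreq) / melBinCount);

    // Largest bin count for which the lowest mel band is still at least one FFT bin wide.
    public static int MaxUsefulMelBins(int sampleRate, int fftSize, double maxFreq) =>
        (int)Math.Floor(HzToMel(maxFreq) / HzToMel(FftBinWidthHz(sampleRate, fftSize)));
}
```

With the values from this issue (FftSize = 2048, MaxFreq = 3000, MelBinCount = 200) and an assumed 44100 Hz sample rate, `LowestMelBandWidthHz(3000, 200)` comes out smaller than `FftBinWidthHz(44100, 2048)`, so under these assumptions the lowest mel bands would be narrower than a single FFT bin. That is consistent with the fix of lowering the mel bin count; `MaxUsefulMelBins` gives a starting point for the iteration.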