Mel-Spectrogram Image Generation Issues

See original GitHub issue

Hi, I’m trying to troubleshoot an error with my mel-spectrogram image generation. The saved .png image looks like the following: [image]

Here is the code I am using:

var sg = new SpectrogramGenerator(sampleRate, FftSize, StepSize, 0, MaxFreq);
sg.Add(audio);

var bitmap = sg.GetBitmapMel(MelBinCount, Intensity, SaveAsDb);
bitmap.Save(file + "MelSpec.png", ImageFormat.Png);

where: FftSize = 2048, StepSize = 300, MaxFreq = 3000, Intensity = 5, MelBinCount = 200, SaveAsDb = false

Also, I am using the ReadWAV method as it is shown in the readme to read .wav audio inputs.

Any help would be appreciated.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

2 reactions
jwoodtke commented, Aug 5, 2021

You were totally right @swharden! I think another problem was the buffer array was in bytes rather than floats.
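To illustrate the bytes-versus-floats point, here is a minimal sketch of how raw PCM bytes map to float samples. It assumes 16-bit little-endian PCM, which the issue does not state; `PcmConversion` and `Pcm16ToFloat` are hypothetical names used only for this example. NAudio's `AudioFileReader.Read(float[], int, int)` already performs this kind of conversion, so the sketch only shows why feeding raw bytes to the spectrogram as if they were samples gives a meaningless image.

```csharp
using System;

public static class PcmConversion
{
    // Assumption: raw holds little-endian signed 16-bit PCM, two bytes per sample.
    // Returns samples scaled to the range [-1, 1).
    public static float[] Pcm16ToFloat(byte[] raw)
    {
        int sampleCount = raw.Length / 2;
        var samples = new float[sampleCount];
        for (int i = 0; i < sampleCount; i++)
        {
            // Combine low and high byte into one signed 16-bit value.
            short value = (short)(raw[2 * i] | (raw[2 * i + 1] << 8));
            samples[i] = value / 32768f;
        }
        return samples;
    }
}
```

Each pair of bytes becomes one sample, so a byte buffer treated as samples would have twice as many values as the audio really contains, none of them on the right scale.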

However, now the mel-spectrograms seem to be missing the bass component. Here is a sample image:

[image: pop 00013MelSpec]

Also, here is the new code for audio reads:

        private async Task<(double[] audio, int sampleRate, double length)> ReadWav(string file, double multiplier = 16000)
        {
            await using (var afr = new AudioFileReader(file))
            {
                int sampleRate = afr.WaveFormat.SampleRate;
                // Total bytes divided by bytes per sample (parentheses fix the operator precedence).
                int sampleCount = (int) (afr.Length / (afr.WaveFormat.BitsPerSample / 8));
                int channelCount = afr.WaveFormat.Channels;
                var audio = new List<double>(sampleCount);
                var buffer = new float[sampleRate * channelCount];
                var length = afr.TotalTime;
                int samplesRead = 0;
                while ((samplesRead = await PopulateBufferArray(afr, buffer)) > 0)
                {
                    await AddAudioRange(audio, buffer.Take(samplesRead).Select(x => x * multiplier));
                }
                return (audio.ToArray(), sampleRate, length.TotalSeconds);
            }
        }

        private async Task<int> PopulateBufferArray(AudioFileReader afr, float[] buffer)
        {
            return afr.Read(buffer, 0, buffer.Length);
        }
        private async Task AddAudioRange(List<double> audio, IEnumerable<double> samplesRead)
        {
            audio.AddRange(samplesRead);
        }

0 reactions
jwoodtke commented, Aug 5, 2021

Never mind, I think the mel bin count I was using was too high. It’s all iterative adjustment from here on out, thanks again for the help!
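One plausible reading of why a high mel bin count could thin out the low end: if the mel bands are packed more tightly than the FFT bins, the lowest bands may not contain a whole FFT bin. The sketch below compares the two spacings. It uses the common formula mel = 2595 · log10(1 + f / 700), which is an assumption here since the library may use a different variant, and the `MelResolution` class and its method names are hypothetical.

```csharp
using System;

public static class MelResolution
{
    // Assumption: HTK-style mel scale; the Spectrogram library may differ.
    public static double HzToMel(double hz) => 2595.0 * Math.Log10(1.0 + hz / 700.0);

    public static double MelToHz(double mel) => 700.0 * (Math.Pow(10.0, mel / 2595.0) - 1.0);

    // Width in Hz of the first of melBinCount equal steps on the mel axis
    // between 0 Hz and maxFreq. This is the narrowest step in the range.
    public static double LowestBandWidthHz(double maxFreq, int melBinCount)
    {
        double melStep = HzToMel(maxFreq) / melBinCount;
        return MelToHz(melStep);
    }

    // Spacing in Hz between adjacent FFT bins.
    public static double FftBinWidthHz(int sampleRate, int fftSize) => (double)sampleRate / fftSize;
}
```

With the settings from the question (FftSize = 2048, MaxFreq = 3000, MelBinCount = 200) and an assumed sample rate of 22050 Hz, which the issue does not give, `FftBinWidthHz` comes out near 10.8 Hz while `LowestBandWidthHz` is near 5.9 Hz, so the lowest mel steps are narrower than a single FFT bin. Dropping to 50 mel bins widens the lowest step to more than 20 Hz. Lowering MelBinCount or raising FftSize are the two knobs this comparison suggests trying.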
