Mel-spectrum of 29.1s segment

Hi, thanks for the effort. I tried to use the mel spectrograms downloaded from Google Drive to run the baseline, but found that the downloaded files cover the full songs. So I tried running scripts/melspectrograms.py to compute the mel spectrograms of 29.1s segments myself. However, I keep getting the error below:

RuntimeError: Error while configuring MelBands: Parameter normalize = "unit_tri" is not within specified range: {unit_sum,unit_max}

May I ask what I missed? Thanks for the help.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

1 reaction
dbogdanov commented, Jul 16, 2019

@annahung31 We have updated our PyPI wheels with the newest version of Essentia. Install or upgrade to the latest Essentia from pip and you should be able to run the spectrogram extraction code without a problem.
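
Before upgrading, it can help to confirm which Essentia build is actually installed in the active environment. The following is a minimal sketch using only the Python standard library; the helper name installed_version is hypothetical, and the assumption is that Essentia was installed from pip under the package name "essentia":

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a pip package, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "essentia" is assumed to be the pip package name used in this environment
    version = installed_version("essentia")
    if version is None:
        print("essentia is not installed; try: pip install essentia")
    else:
        print(f"essentia {version} found; upgrade with: pip install --upgrade essentia")
```

If the reported version predates the updated wheels, upgrading should make "unit_tri" an accepted value for the normalize parameter, as the comment above suggests.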

1 reaction
rfalcon100 commented, Jul 16, 2019

To avoid recomputing all spectrograms, I made a small change to the dataset so that every file is now cropped to the desired shape of (96, 1366). This is probably not the best way to do it, but it works. The __getitem__ method looks like:

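# Module-level imports needed by this method
import os
import numpy as np
from torchvision import transforms

def __getitem__(self, index):
    # Load the precomputed full-length mel spectrogram (.npy) for this track
    fn = os.path.join(self.root, 'data/raw_30s_specs/', self.dictionary[index]['path'][:-3]+'npy')
    audio = np.array(np.load(fn)).astype('float32')
    tags = self.dictionary[index]['tags']

    # Transforms: center-crop to (96, 1366); ToTensor adds a leading channel dimension
    self.transform = transforms.Compose([
        transforms.ToPILImage(),
        transforms.CenterCrop((96, 1366)),
        transforms.ToTensor(),
    ])

    if self.transform:
        audio = self.transform(audio)

    return audio, tags.astype('float32')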
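
For reference, the index arithmetic behind a center crop along the time axis can be sketched in plain Python. This is only an illustration, not the torchvision implementation (its rounding may differ by one frame). The helper names are hypothetical, and the hop size of 256 samples and sample rate of 12000 Hz are assumptions that do not come from this thread; under those assumptions, 1366 frames correspond to roughly 29.1 s:

```python
def center_crop_bounds(total_frames, target_frames=1366):
    """Return (start, end) indices of a centered window along the time axis."""
    if total_frames < target_frames:
        raise ValueError("spectrogram is shorter than the requested crop")
    start = (total_frames - target_frames) // 2
    return start, start + target_frames


def frames_to_seconds(n_frames, hop_size=256, sample_rate=12000):
    """Convert a frame count to seconds.

    hop_size and sample_rate are ASSUMED values, not taken from the thread.
    """
    return n_frames * hop_size / sample_rate


if __name__ == "__main__":
    start, end = center_crop_bounds(3000)
    print(f"crop frames [{start}, {end}) out of 3000")
    print(f"approximate duration: {frames_to_seconds(1366):.1f} s")
```

With a NumPy array of shape (96, total_frames), the same crop would be spec[:, start:end].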

One more change is needed in the model: the batches now have shape (batch, channels, height, width), so there is no need to unsqueeze.

    def forward(self, x):
        # x = x.unsqueeze(1)  # no longer needed: ToTensor already added the channel dimension

        # init bn
        x = self.bn_init(x)

        # layer 1
        x = self.mp_1(nn.ELU()(self.bn_1(self.conv_1(x))))
        # layer 2
        x = self.mp_2(nn.ELU()(self.bn_2(self.conv_2(x))))
        # layer 3
        x = self.mp_3(nn.ELU()(self.bn_3(self.conv_3(x))))
        # layer 4
        x = self.mp_4(nn.ELU()(self.bn_4(self.conv_4(x))))
        # layer 5
        x = self.mp_5(nn.ELU()(self.bn_5(self.conv_5(x))))

        # classifier
        x = x.view(x.size(0), -1)
        x = self.dropout(x)
        logit = nn.Sigmoid()(self.dense(x))

        return logit