Bug(?) when batching wav files that are too short

Hello,

I’m running into issues when using upstream models on audio files that are too short. The model I’m using is audio_albert. The process is killed when a batch includes short waveforms, although each waveform is processed without problems individually.

For example, for a batch of wavs with lengths [torch.Size([98399]), torch.Size([311360]), torch.Size([70561]), torch.Size([16801]), torch.Size([5280]), torch.Size([30240]), torch.Size([15040]), torch.Size([1280])], the program is killed without any further error message when I run model(wavs), but I can run model(wavs[0]), model(wavs[7]), etc. without issue.

I think at least a more informative error message or some sort of warning when input waveforms are too short would help.

Thanks for doing all this, great codebase!
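
The warning requested above could look roughly like the following. This is only a sketch: check_wav_lengths and MIN_SAMPLES are hypothetical names that do not exist in s3prl, and the threshold of 400 samples (25 ms, assuming 16 kHz audio) is a placeholder rather than a limit documented for audio_albert.

```python
import warnings

# Hypothetical threshold: 400 samples is 25 ms at an assumed 16 kHz sample rate.
# This value is a placeholder, not something taken from s3prl.
MIN_SAMPLES = 400


def check_wav_lengths(lengths, min_samples=MIN_SAMPLES):
    """Warn about waveforms shorter than min_samples and return their indices."""
    too_short = [i for i, n in enumerate(lengths) if n < min_samples]
    if too_short:
        warnings.warn(
            f"{len(too_short)} waveform(s) shorter than {min_samples} samples "
            f"at batch indices {too_short}"
        )
    return too_short
```

With real tensors, the lengths could be collected with [w.shape[0] for w in wavs] before calling model(wavs).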

Issue Analytics

Created 3 years ago
Comments:8 (3 by maintainers)

Top GitHub Comments

2 reactions

andi611 commented, Feb 22, 2021

Hi!

Sure, here it is:

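import torch

# Seed PyTorch's RNG; torch.rand ignores Python's random module
torch.manual_seed(0)

model = torch.hub.load("s3prl/s3prl", "audio_albert")
lengths = [3898720, 98399, 311360, 70561, 16801, 5280, 30240, 15040, 1280]

# One random waveform per length
wavs = [torch.rand(n) for n in lengths]

# Run each waveform on its own
for wav in wavs:
    f = model([wav])
    print(f[0].shape)

# Run all waveforms as a single batch (the call reported as killed)
feats = model(wavs)
print([x.shape for x in feats])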

Hi,

I think the problem is that you are running this on CPU instead of GPU. You can simply add .to('cuda') after the model and after every input to load them onto your GPU. An example of doing extraction on GPU is given here. Let us know if you still encounter this problem after moving to GPU, thanks!

Andy
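
The suggestion above can be sketched as a small helper. move_to_device is a hypothetical name, not part of s3prl or PyTorch; it only assumes that the model and every waveform expose a .to(device) method, as torch.nn.Module and torch.Tensor do.

```python
def move_to_device(model, wavs, device):
    """Return the model and a new list of waveforms, all moved to `device`.

    Relies only on the objects having a .to(device) method. For a tensor,
    .to() returns a new tensor, so the returned list must be used
    instead of the original one.
    """
    model = model.to(device)
    wavs = [wav.to(device) for wav in wavs]
    return model, wavs
```

With the script above, this might be called as model, wavs = move_to_device(model, wavs, "cuda"), after checking torch.cuda.is_available().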

0 reactions

trangham283 commented, Feb 22, 2021

Hi @andi611

Moving everything to the GPU resulted in a more informative error message (OOM). So the process being “killed” seems to mean that the CPU doesn’t have enough memory either, but doesn’t give an informative error message?

Thanks for looking into this.
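
One possible reason the batch runs out of memory while each waveform works on its own is padding. The sketch below assumes the batch is zero-padded to its longest waveform, which is common for batched inference but is not confirmed in this thread. padded_batch_cost is a hypothetical helper, and raw sample counts are only a rough proxy for memory use.

```python
def padded_batch_cost(lengths):
    """Compare sample counts for a zero-padded batch vs. one-by-one processing."""
    padded = len(lengths) * max(lengths)  # every row padded to the longest
    unpadded = sum(lengths)               # total samples if run individually
    return padded, unpadded, padded / unpadded


# Lengths from the reproduction script above
lengths = [3898720, 98399, 311360, 70561, 16801, 5280, 30240, 15040, 1280]
padded, unpadded, ratio = padded_batch_cost(lengths)
print(f"padded: {padded} samples, unpadded: {unpadded} samples, ratio: {ratio:.1f}x")
```

Under that assumption, the single batch holds roughly nine times as many samples as the longest waveform alone, so grouping waveforms of similar length into smaller batches may reduce peak memory.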