[Bug]: Training on CommonVoice Speech classification recipe crashes with AssertionError assert len(target_shape) == tensor.ndim


Describe the bug

This is similar to the issue: https://github.com/speechbrain/speechbrain/issues/651

However, that issue was for ASR; this one is for speaker classification.

Training crashed with the AssertionError shown in the log output below.

Expected behaviour

Training should get through the epoch without crashing.

To Reproduce

Convert the CommonVoice audio files to WAV format, then process them with train.py.

Versions

I’m using the CommonVoice German data.

Relevant log output

valid_loader_kwargs=hparams["dataloader_options"],
  File "E:\Study\Thesis\Voice print\speechbrain\speechbrain\core.py", line 1156, in fit
    self._fit_train(train_set=train_set, epoch=epoch, enable=enable)
  File "E:\Study\Thesis\Voice print\speechbrain\speechbrain\core.py", line 1008, in _fit_train
    for batch in t:
  File "C:\Python\Python37\lib\site-packages\tqdm\std.py", line 1195, in __iter__
    for obj in iterable:
  File "C:\Python\Python37\lib\site-packages\torch\utils\data\dataloader.py", line 521, in __next__
    data = self._next_data()
  File "C:\Python\Python37\lib\site-packages\torch\utils\data\dataloader.py", line 561, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "C:\Python\Python37\lib\site-packages\torch\utils\data\_utils\fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "E:\Study\Thesis\Voice print\speechbrain\speechbrain\dataio\batch.py", line 125, in __init__
    padded = PaddedData(*padding_func(values, **padding_kwargs))
  File "E:\Study\Thesis\Voice print\speechbrain\speechbrain\utils\data_utils.py", line 445, in batch_pad_right
    t, max_shape, mode=mode, value=value
  File "E:\Study\Thesis\Voice print\speechbrain\speechbrain\utils\data_utils.py", line 372, in pad_right_to
    assert len(target_shape) == tensor.ndim
AssertionError

Additional context

No response

Issue Analytics

  • State: closed
  • Created 10 months ago
  • Comments: 6

Top GitHub Comments

1 reaction
praveenmathew93 commented, Dec 1, 2022

I added the code and it seems to have worked. Got through the iteration.

Thank you so much @AsuMagic and @TParcollet for helping me understand the signal. From the linked issue I was not able to figure out what ‘channels’ represent. So the increase in dimension is basically an increase in the number of channels.

Thanks again! Closing the issue!

1 reaction
AsuMagic commented, Dec 1, 2022

"Plus the values look weird."

If you’re referring to the first and last few values, not really. Those are just close to zero, which is about what I’d expect for normalized float audio; it’s just silence here.

"Is there a way I can accommodate them?"

The linked issue gives a solution. You could try adding something like this after the read_audio line:

# Downmix multi-channel audio to mono by averaging over the channel axis.
if sig.dim() > 1:
    sig = torch.mean(sig, dim=1)

The shape of a signal in mono is (number_of_samples,).
The shape of a signal in stereo, as can be seen from the printed value here, is (number_of_samples, number_of_channels).
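
To see why these two shapes cannot share a batch, here is a minimal sketch in plain PyTorch. `pad_right_sketch` is a hypothetical stand-in written for illustration, not SpeechBrain's `pad_right_to`; it only mirrors the `ndim` check that appears in the traceback, and the tensor sizes are arbitrary.

```python
import torch
import torch.nn.functional as F


def pad_right_sketch(tensor, target_shape):
    """Zero-pad `tensor` on the right so it matches `target_shape`.

    Hypothetical helper for illustration only; it mirrors the ndim
    check from the traceback, not SpeechBrain's actual implementation.
    """
    assert len(target_shape) == tensor.ndim
    pads = []
    # F.pad expects (left, right) pairs starting from the LAST dimension.
    for size, target in zip(reversed(tensor.shape), reversed(target_shape)):
        pads.extend([0, target - size])
    return F.pad(tensor, tuple(pads))


mono = torch.zeros(16000)        # (number_of_samples,)
stereo = torch.zeros(12000, 2)   # (number_of_samples, number_of_channels)

# Two mono clips pad fine: both are 1-D.
padded = pad_right_sketch(torch.zeros(8000), mono.shape)

# A stereo clip is 2-D, so a 1-D target shape trips the assertion.
try:
    pad_right_sketch(stereo, mono.shape)
    mixed_batch_ok = True
except AssertionError:
    mixed_batch_ok = False
```

If a dataset mixes mono and stereo files, the first batch that contains both would hit this kind of mismatch, which matches the crash reported here.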

So if the signal tensor has a second dimension, we can assume it is the channel dimension. Averaging the channels is a straightforward and common way to downmix stereo to mono.
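
Here is a self-contained sketch of that downmix, using random tensors in place of real audio. The helper name `to_mono` is made up for this example, and it assumes the (number_of_samples, number_of_channels) layout described above.

```python
import torch


def to_mono(sig):
    """Average over the channel axis if present; return mono input unchanged.

    Assumes `sig` has shape (samples,) or (samples, channels).
    """
    if sig.dim() > 1:
        sig = torch.mean(sig, dim=1)
    return sig


stereo = torch.rand(16000, 2)   # stand-in for a stereo clip
mono = torch.rand(16000)        # stand-in for a mono clip

down = to_mono(stereo)          # shape becomes (16000,)
same = to_mono(mono)            # already 1-D, returned as-is
```

After this step every signal is 1-D, so all items in a batch have the same number of dimensions before padding.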
