Training on CommonVoice standard recipe crashes

I was training an ASR model using the CommonVoice recipe from here: https://github.com/speechbrain/speechbrain/blob/develop/recipes/CommonVoice/ASR/seq2seq/hparams/train_en.yaml

After 920 iterations, training crashed with the following error:

Traceback (most recent call last):
  File "/research_shared/home/eugene/speechbrain/recipes/CommonVoice/ASR/seq2seq/train.py", line 333, in <module>
    asr_brain.fit(
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/speechbrain/core.py", line 1011, in fit
    for batch in t:
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
    data = self._next_data()
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1199, in _next_data
    return self._process_data(data)
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1225, in _process_data
    data.reraise()
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/torch/_utils.py", line 429, in reraise
    raise self.exc_type(msg)
AssertionError: Caught AssertionError in DataLoader worker process 2.
Original Traceback (most recent call last):
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/speechbrain/dataio/batch.py", line 124, in __init__
    padded = PaddedData(*padding_func(values, **padding_kwargs))
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/speechbrain/utils/data_utils.py", line 414, in batch_pad_right
    padded, valid_percent = pad_right_to(
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/speechbrain/utils/data_utils.py", line 342, in pad_right_to
    assert len(target_shape) == tensor.ndim
AssertionError

When running with the -O flag (which disables assert statements), I got the following error instead:

...
  File "/research_shared/home/eugene/speechbrain/speechbrain/utils/data_utils.py", line 420, in batch_pad_right
    batched = torch.stack(batched)
RuntimeError: stack expects each tensor to be equal size, but got [131328] at entry 0 and [131328, 1] at entry 8

I tried both the latest develop branch and the v0.5.5 PyPI installation and got the same error.
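
Both tracebacks point at the same condition: one item in the batch has a different number of dimensions than the others. The following is only a minimal sketch of that condition, not SpeechBrain's code. The helper name `find_ndim_mismatches` and the tensor lengths are made up for illustration; only `torch.stack` and `Tensor.ndim` are real PyTorch calls.

```python
import torch


def find_ndim_mismatches(tensors):
    """Return the indices of tensors whose ndim differs from the first one.

    Hypothetical helper, not part of SpeechBrain or PyTorch.
    """
    reference_ndim = tensors[0].ndim
    return [i for i, t in enumerate(tensors) if t.ndim != reference_ndim]


# One signal shaped [time] and one shaped [time, channels],
# mirroring the [131328] vs. [131328, 1] shapes in the error message.
one_dim = torch.zeros(16)
two_dim = torch.zeros(16, 1)
batch = [one_dim, two_dim]

mismatched = find_ndim_mismatches(batch)

# Stacking tensors of different shapes raises a RuntimeError,
# which is what surfaces once the assertion is skipped.
try:
    torch.stack(batch)
    stack_error = None
except RuntimeError as err:
    stack_error = str(err)
```

Running a check like this over the dataset before training could help locate the offending files, assuming each item can be loaded individually.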

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (2 by maintainers)

Top GitHub Comments

TParcollet commented, Apr 15, 2021

@EugKar Could you try adding this code and let me know if it works?

if info.num_channels > 1:
    sig = torch.mean(sig, dim=1)

(In the audio_pipeline function.)

TParcollet commented, Apr 15, 2021

Right, this is something I need to fix. The problem is simple: For some reason, CommonVoice has stereo files. I will fix this ASAP.
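
As a rough illustration of the downmix suggested in this thread, here is a self-contained sketch. It assumes signals are laid out as [time, channels], which is what `dim=1` in the snippet above implies. `to_mono` is a hypothetical helper name, not a SpeechBrain function, and it checks `sig.ndim` instead of `info.num_channels` so that it runs without an audio file.

```python
import torch


def to_mono(sig: torch.Tensor) -> torch.Tensor:
    """Average over the channel axis so every signal has shape [time].

    Assumes multi-channel input is laid out as [time, channels].
    """
    if sig.ndim > 1:
        sig = torch.mean(sig, dim=1)
    return sig


# A mono signal of shape [8] and a two-channel signal of shape [8, 2].
mono = torch.ones(8)
stereo = torch.stack([torch.zeros(8), torch.full((8,), 2.0)], dim=1)

# After downmixing, both items are 1-D and can be stacked into one batch.
batch = torch.stack([to_mono(mono), to_mono(stereo)])
```

With every item reduced to one dimension, the shape mismatch reported in the issue should no longer occur at collation time, under the layout assumption stated above.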
