Training on CommonVoice standard recipe crashes

I was training an ASR model using the CommonVoice recipe from here: https://github.com/speechbrain/speechbrain/blob/develop/recipes/CommonVoice/ASR/seq2seq/hparams/train_en.yaml

After 920 iterations, training crashed with the following error:

Traceback (most recent call last):
  File "/research_shared/home/eugene/speechbrain/recipes/CommonVoice/ASR/seq2seq/train.py", line 333, in <module>
    asr_brain.fit(
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/speechbrain/core.py", line 1011, in fit
    for batch in t:
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
    data = self._next_data()
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1199, in _next_data
    return self._process_data(data)
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1225, in _process_data
    data.reraise()
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/torch/_utils.py", line 429, in reraise
    raise self.exc_type(msg)
AssertionError: Caught AssertionError in DataLoader worker process 2.
Original Traceback (most recent call last):
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/speechbrain/dataio/batch.py", line 124, in __init__
    padded = PaddedData(*padding_func(values, **padding_kwargs))
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/speechbrain/utils/data_utils.py", line 414, in batch_pad_right
    padded, valid_percent = pad_right_to(
  File "/research_shared/home/eugene/miniconda/envs/speechbrain/lib/python3.9/site-packages/speechbrain/utils/data_utils.py", line 342, in pad_right_to
    assert len(target_shape) == tensor.ndim
AssertionError

When running Python with the -O flag (which disables assert statements), I got the following error instead:

...
   File "/research_shared/home/eugene/speechbrain/speechbrain/utils/data_utils.py", line 420, in batch_pad_right
    batched = torch.stack(batched)
RuntimeError: stack expects each tensor to be equal size, but got [131328] at entry 0 and [131328, 1] at entry 8

I tried both the latest develop branch and the v0.5.5 PyPI installation and got the same error with each.

Top GitHub Comments

TParcollet commented, Apr 15, 2021

@EugKar Could you try adding this code and let me know if it works?

if info.num_channels > 1:
    sig = torch.mean(sig, dim=1)

(This goes in the audio_pipeline function.)
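
To illustrate why the suggested averaging resolves the stacking error, here is a minimal sketch. It assumes the loader returns a tensor of shape [time] for mono clips and [time, channels] for multi-channel clips, which is consistent with the dim=1 in the snippet above. The helper name to_mono is hypothetical and not part of SpeechBrain, and it checks the tensor's dimensionality rather than info.num_channels so that it runs without an audio file.

```python
import torch


def to_mono(sig: torch.Tensor) -> torch.Tensor:
    """Collapse a [time, channels] signal to [time]; pass 1-D signals through.

    Hypothetical helper for illustration only, not a SpeechBrain function.
    """
    if sig.ndim == 2:
        # Average across the channel axis (dim=1 in a [time, channels] layout).
        sig = torch.mean(sig, dim=1)
    return sig


mono = torch.zeros(131328)       # shape [131328]
stereo = torch.zeros(131328, 2)  # shape [131328, 2]

# Stacking the raw tensors would fail because their shapes differ;
# after the downmix both are 1-D and can share a batch.
batch = torch.stack([to_mono(mono), to_mono(stereo)])
print(batch.shape)  # torch.Size([2, 131328])
```

Averaging the channels is only one way to downmix; keeping a single channel would also give a 1-D signal, at the cost of discarding the other one.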

TParcollet commented, Apr 15, 2021

Right, this is something I need to fix. The problem is simple: for some reason, CommonVoice has stereo files. I will fix this ASAP.
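
For anyone who wants to check which clips are affected, the following is a rough sketch that lists multi-channel files in a directory. It relies on a loud assumption: the clips have already been converted to PCM WAV, because Python's standard wave module cannot read the MP3 files that CommonVoice distributes. The function name find_multichannel_wavs and the directory layout are hypothetical.

```python
import wave
from pathlib import Path


def find_multichannel_wavs(root):
    """Return (path, channel_count) for every WAV under root with more than one channel.

    Hypothetical helper; assumes the clips are PCM WAV files.
    """
    hits = []
    for path in sorted(Path(root).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            channels = wav.getnchannels()
        if channels > 1:
            hits.append((path, channels))
    return hits
```

Calling find_multichannel_wavs on a clips directory of your choosing returns an empty list when every file is mono. For MP3 input, an audio library that exposes the channel count would be needed instead.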
