
Unable to train an ASR/ST model on MuST-C data.

See original GitHub issue

Hi, I am trying to train a new ASR model by following the steps available here.

I downloaded the MuST-C version 2.0 data available here.

Unzipping the tar file gives a folder titled en-de, which contains two subfolders, data and doc.

data: dev train tst-COMMON tst-HE

Then I preprocessed the data using the following command:

python fairseq/examples/speech_to_text/prep_mustc_data.py --data-root mustcv2/ --task asr --vocab-type unigram --vocab-size 5000

The mustcv2 folder contains the en-de folder that was unzipped earlier.

The preprocessing ran successfully, populating the en-de folder as below:

(screenshot of the populated en-de folder)
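For reference, the files I expect prep_mustc_data.py to produce per language pair can be sketched as below. This is my own reconstruction from the file names that appear later in this issue (config_asr.yaml, spm_unigram5000_asr.txt, the train_asr/dev_asr manifests); the exact list may differ by fairseq version.

```python
# Sketch of the expected per-language-pair outputs of prep_mustc_data.py
# for --task asr. Reconstructed from file names seen in this issue;
# not an authoritative listing of fairseq's outputs.

def expected_asr_outputs(vocab_type="unigram", vocab_size=5000):
    spm_prefix = f"spm_{vocab_type}{vocab_size}_asr"
    files = [
        "config_asr.yaml",       # data config consumed via --config-yaml
        f"{spm_prefix}.model",   # SentencePiece model
        f"{spm_prefix}.txt",     # SentencePiece vocabulary
    ]
    # One TSV manifest per split, matching the data/ subfolders.
    for split in ("train", "dev", "tst-COMMON", "tst-HE"):
        files.append(f"{split}_asr.tsv")
    return files

print(expected_asr_outputs())
```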

Then I tried to train the model using this command:

fairseq-train mustcv2/en-de/ --config-yaml mustcv2/en-de/config_asr.yaml --train-subset train_asr --valid-subset dev_asr --save-dir checkpoints/asr/ --num-workers 4 --max-tokens 40000 --max-update 100000 --task speech_to_text --criterion label_smoothed_cross_entropy --report-accuracy --arch s2t_transformer_s --optimizer adam --lr 1e-3 --lr-scheduler inverse_sqrt --warmup-updates 10000 --clip-norm 10.0 --seed 1 --update-freq 8

This led to an error saying dict.txt was not present.

(screenshot of the dict.txt error)

From my previous experience with fairseq, I copied spm_unigram5000_asr.txt to dict.txt and ran the training command again, for which I am getting the error below.

Traceback (most recent call last):                                                                                                                                     
  File "/fastdisk/Sugeeth/miniconda3/envs/offline/bin/fairseq-train", line 33, in <module>
    sys.exit(load_entry_point('fairseq', 'console_scripts', 'fairseq-train')())
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq_cli/train.py", line 491, in cli_main
    distributed_utils.call_main(cfg, main)
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq/distributed/utils.py", line 369, in call_main
    main(cfg, **kwargs)
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq_cli/train.py", line 169, in main
    valid_losses, should_stop = train(cfg, trainer, task, epoch_itr)
  File "/fastdisk/Sugeeth/miniconda3/envs/offline/lib/python3.7/contextlib.py", line 74, in inner
    return func(*args, **kwds)
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq_cli/train.py", line 279, in train
    log_output = trainer.train_step(samples)
  File "/fastdisk/Sugeeth/miniconda3/envs/offline/lib/python3.7/contextlib.py", line 74, in inner
    return func(*args, **kwds)
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq/trainer.py", line 668, in train_step
    ignore_grad=is_dummy_batch,
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq/tasks/fairseq_task.py", line 475, in train_step
    loss, sample_size, logging_output = criterion(model, sample)
  File "/fastdisk/Sugeeth/miniconda3/envs/offline/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq/criterions/label_smoothed_cross_entropy.py", line 79, in forward
    net_output = model(**sample["net_input"])
  File "/fastdisk/Sugeeth/miniconda3/envs/offline/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq/models/speech_to_text/s2t_transformer.py", line 268, in forward
    encoder_out = self.encoder(src_tokens=src_tokens, src_lengths=src_lengths)
  File "/fastdisk/Sugeeth/miniconda3/envs/offline/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq/models/speech_to_text/s2t_transformer.py", line 337, in forward
    if self.num_updates < self.encoder_freezing_updates:
TypeError: '<' not supported between instances of 'int' and 'NoneType'

encoder_freezing_updates is being set to None, so I changed the check at s2t_transformer.py line 337 as shown below and ran the training command again:

(screenshot of the modified check)
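The change I made amounts to the following guard, paraphrased here in case the screenshot does not render (this is a sketch of my local edit, not the upstream fairseq code, where the default is presumably an int such as 0):

```python
# Sketch of the guard around the frozen-encoder check in s2t_transformer.py.
# Treats a missing/None encoder_freezing_updates as "never freeze",
# which avoids comparing an int against None.

def encoder_is_frozen(num_updates, encoder_freezing_updates):
    if encoder_freezing_updates is None:
        return False
    return num_updates < encoder_freezing_updates
```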

for which I am getting the error below.

Traceback (most recent call last):                                                                                                                                     
  File "/fastdisk/Sugeeth/miniconda3/envs/offline/bin/fairseq-train", line 33, in <module>
    sys.exit(load_entry_point('fairseq', 'console_scripts', 'fairseq-train')())
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq_cli/train.py", line 491, in cli_main
    distributed_utils.call_main(cfg, main)
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq/distributed/utils.py", line 369, in call_main
    main(cfg, **kwargs)
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq_cli/train.py", line 169, in main
    valid_losses, should_stop = train(cfg, trainer, task, epoch_itr)
  File "/fastdisk/Sugeeth/miniconda3/envs/offline/lib/python3.7/contextlib.py", line 74, in inner
    return func(*args, **kwds)
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq_cli/train.py", line 279, in train
    log_output = trainer.train_step(samples)
  File "/fastdisk/Sugeeth/miniconda3/envs/offline/lib/python3.7/contextlib.py", line 74, in inner
    return func(*args, **kwds)
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq/trainer.py", line 668, in train_step
    ignore_grad=is_dummy_batch,
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq/tasks/fairseq_task.py", line 475, in train_step
    loss, sample_size, logging_output = criterion(model, sample)
  File "/fastdisk/Sugeeth/miniconda3/envs/offline/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq/criterions/label_smoothed_cross_entropy.py", line 79, in forward
    net_output = model(**sample["net_input"])
  File "/fastdisk/Sugeeth/miniconda3/envs/offline/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq/models/speech_to_text/s2t_transformer.py", line 270, in forward
    prev_output_tokens=prev_output_tokens, encoder_out=encoder_out
  File "/fastdisk/Sugeeth/miniconda3/envs/offline/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq/models/transformer.py", line 823, in forward
    alignment_heads=alignment_heads,
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq/models/speech_to_text/s2t_transformer.py", line 396, in extract_features
    alignment_heads,
  File "/fastdisk/Sugeeth/offline/fairseq/fairseq/models/transformer.py", line 890, in extract_features_scriptable
    padding_mask = encoder_out["encoder_padding_mask"][0]
IndexError: list index out of range

Printing encoder_out["encoder_padding_mask"] in transformer.py shows an empty list being passed the second time the function is called. (screenshot of the debug output)
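The failing lookup can be reproduced in isolation. A defensive version of that line (my own sketch for illustration, not a proposed upstream patch) would fall back to None when the list is empty:

```python
# transformer.py indexes encoder_out["encoder_padding_mask"][0] directly,
# which raises IndexError when the list is empty. A None fallback avoids that.

def get_padding_mask(encoder_out):
    masks = encoder_out.get("encoder_padding_mask") or []
    return masks[0] if len(masks) > 0 else None

# The crash case from this issue: an empty list on the second call.
print(get_padding_mask({"encoder_padding_mask": []}))  # prints None
```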

The same issue is occurring with speech translation.

Please let me know if I am doing anything wrong here. I am using the latest fairseq, cloned just now from the master branch; the torch version is 1.8.1+cu102, on Ubuntu 20.04.

My apologies if this is not a bug. Please let me know how I can train the model.

Thanks.

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 7

Top GitHub Comments

1 reaction

bhaddow commented, Apr 8, 2021

The first error you got above suggests that fairseq is not finding config_asr.yaml. When I train ASR or ST, I do not provide a path to config_asr.yaml, just the file name itself. I think fairseq prepends the data directory path to the config file path, and silently ignores the config file if it cannot find it.
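Concretely, if fairseq joins the data directory with whatever is passed to --config-yaml (a simplified illustration of the behaviour described above, not the actual fairseq code), then passing the full path produces a doubled, nonexistent path:

```python
import os.path as op

data_root = "mustcv2/en-de"

# Passing the full path, as in the failing command:
bad = op.join(data_root, "mustcv2/en-de/config_asr.yaml")
# Passing just the file name, as suggested:
good = op.join(data_root, "config_asr.yaml")

print(bad)   # mustcv2/en-de/mustcv2/en-de/config_asr.yaml -- does not exist
print(good)  # mustcv2/en-de/config_asr.yaml -- the real file
```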

I am training successfully with commit 1c0439b7da

0 reactions

holyma commented, Sep 23, 2021

@sugeeth14 @Atla11nTa Have you trained ASR/ST on the MuST-C data and reproduced the published results?
