Errors while replicating Example 3: ST on CoVoST
🐛 Bug
Hi, thank you for providing such a nice toolkit. I'm currently working on research about speech translation and tried to replicate your experiment. I followed the instructions here, but could not reproduce your results.
There are four main problems:
- Bug 1: Some Python packages are missing
- Bug 2: Some CoVoST2 audio files are empty
- Bug 3: Path in config file is incorrect
- Bug 4: Tuple indices must be integers
Detailed error messages are at the bottom of this report.
To Reproduce
Steps to reproduce the behavior:
(Command-line options are the same as the provided ones, except that I didn't set `--num-workers` and `--update-freq` when executing `fairseq-train`.)
- Install fairseq following the instructions on GitHub.
- Run `examples/speech_to_text/prep_covost_data.py`. You will see Bug 1.
- Install pandas, torchaudio, and sentencepiece using pip.
- Run `examples/speech_to_text/prep_covost_data.py` again. You will see Bug 2 for some languages (at least I encountered this problem for En, De, and Es).
- Find and remove the empty audio files from `<covost root>/<lang>/raw/clips`, and remove the corresponding rows from the TSV files (see the sketch after this list).
- Run `examples/speech_to_text/prep_covost_data.py` again. It finishes successfully.
- Run the `fairseq-train` command: `fairseq-train ${COVOST_ROOT} --train-subset train_asr_<lang> ...` You will see Bug 3.
- Copy or rename the config file: `cp config_asr_<lang>.yaml config.yaml`
- Change `audio_root` in line 1 of `config.yaml` from `audio_root: <path to covost root>/<lang>` to `audio_root: <path to covost root>`.
- Run the `fairseq-train` command: `fairseq-train ${COVOST_ROOT}/<lang> --train-subset train_asr_<lang> ...` You will see Bug 4.
- Change `fairseq/fairseq/models/transformer.py` lines 809-817 to:

  ```python
  encoder_out[0]
  if (encoder_out is not None and len(encoder_out[0]) > 0)
  else None,
  encoder_out[1]
  if (encoder_out is not None)
  else None,
  ```

- Run the `fairseq-train` command again; training starts successfully.
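
For reference, the "remove empty clips" step could be scripted roughly as below. This is a minimal sketch, not part of fairseq: the paths, the `validated.tsv` file name, and the `path` column are assumptions that would need to be adapted to the actual Common Voice/CoVoST layout.

```python
import csv
from pathlib import Path

# Hypothetical locations; adjust to your CoVoST root and language.
clips_dir = Path("/path/to/covost_root/es/raw/clips")
tsv_in = Path("/path/to/covost_root/es/raw/validated.tsv")
tsv_out = tsv_in.with_suffix(".filtered.tsv")

# Zero-byte mp3 files are the ones torchaudio/sox cannot open.
empty = {p.name for p in clips_dir.glob("*.mp3") if p.stat().st_size == 0}
print(f"found {len(empty)} empty clips")

# Rewrite the TSV without the rows that point at empty clips
# (assumes a tab-separated file with a 'path' column naming the clip).
with tsv_in.open(newline="") as fin, tsv_out.open("w", newline="") as fout:
    reader = csv.DictReader(fin, delimiter="\t")
    writer = csv.DictWriter(fout, fieldnames=reader.fieldnames, delimiter="\t")
    writer.writeheader()
    for row in reader:
        if row["path"] not in empty:
            writer.writerow(row)

# Finally, delete the empty files themselves.
for name in empty:
    (clips_dir / name).unlink()
```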
Expected behavior
All commands finish without any error.
Environment
- fairseq version: installed via the following commands on Nov. 27, 2020 (commit dea66cc)
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable ./
- PyTorch version: 1.7.0
- OS: Ubuntu 20.04
- Python version: 3.8.5
- CUDA/cuDNN version: 10.2/8.0.3
- GPU models and configuration: GeForce GTX 1080 Ti x 8
Error messages
Error messages for Bug 1
```
Traceback (most recent call last):
File "fairseq/examples/speech_to_text/prep_covost_data.py", line 16, in <module>
import pandas as pd
ModuleNotFoundError: No module named 'pandas'
Traceback (most recent call last):
File "fairseq/examples/speech_to_text/prep_covost_data.py", line 17, in <module>
import torchaudio
ModuleNotFoundError: No module named 'torchaudio'
...
Traceback (most recent call last):
File "fairseq/examples/speech_to_text/prep_covost_data.py", line 18, in <module>
from examples.speech_to_text.data_utils import (
File "<path to fairseq>/fairseq/examples/speech_to_text/data_utils.py", line 18, in <module>
import sentencepiece as sp
ModuleNotFoundError: No module named 'sentencepiece'
```
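
Before re-running the prep script, a quick way to check that the three dependencies from these tracebacks are importable (a trivial, fairseq-independent snippet):

```python
import importlib

# These are exactly the imports that fail in the tracebacks above.
for name in ("pandas", "torchaudio", "sentencepiece"):
    try:
        importlib.import_module(name)
        print(f"{name}: OK")
    except ModuleNotFoundError:
        print(f"{name}: missing, install it with pip")
```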
Error messages for Bug 2
```
Fetching split train...
100%|██████████████████████████████████| 5.80G/5.80G [10:37<00:00, 9.76MB/s]
100%|██████████████████████████████████| 3.05M/3.05M [00:00<00:00, 3.66MB/s]
Extracting log mel filter bank features...
  0%|                                  | 0/79015 [00:00<?, ?it/s]<path to venv>/venv/lib/python3.8/site-packages/torchaudio/compliance/kaldi.py:574: UserWarning: The function torch.rfft is deprecated and will be removed in a future PyTorch release. Use the new torch.fft module functions, instead, by importing torch.fft and calling torch.fft.fft or torch.fft.rfft. (Triggered internally at /pytorch/aten/src/ATen/native/SpectralOps.cpp:590.)
  fft = torch.rfft(strided_input, 1, normalized=False, onesided=True)
 37%|████████████▋                     | 29137/79015 [1:21:38<1:52:09, 7.41it/s]
formats: can't open input file `<path to covost root>/es/raw/clips/common_voice_es_19499893.mp3':
 37%|████████████▋                     | 29138/79015 [1:21:38<2:19:45, 5.95it/s]
Traceback (most recent call last):
File "fairseq/examples/speech_to_text/prep_covost_data.py", line 294, in <module>
main()
File "fairseq/examples/speech_to_text/prep_covost_data.py", line 290, in main
process(args)
File "fairseq/examples/speech_to_text/prep_covost_data.py", line 222, in process
for waveform, sample_rate, _, _, _, utt_id in tqdm(dataset):
File "<path to venv>/venv/lib/python3.8/site-packages/tqdm/std.py", line 1193, in __iter__
for obj in iterable:
File "fairseq/examples/speech_to_text/prep_covost_data.py", line 201, in __getitem__
waveform, sample_rate = torchaudio.load(path)
File "<path to venv>/venv/lib/python3.8/site-packages/torchaudio/backend/sox_backend.py", line 48, in load
sample_rate = _torchaudio.read_audio_file(
RuntimeError: Error opening audio file
```
Error messages for Bug 3
```
2020-11-30 14:36:44 | INFO | fairseq.data.audio.speech_to_text_dataset | Cannot find <path to covost root>/config.yaml
Traceback (most recent call last):
File "<path to venv>/venv/bin/fairseq-train", line 33, in <module>
sys.exit(load_entry_point('fairseq', 'console_scripts', 'fairseq-train')())
File "<path to fairseq>/fairseq/fairseq_cli/train.py", line 392, in cli_main
distributed_utils.call_main(cfg, main)
File "<path to fairseq>/fairseq/fairseq/distributed_utils.py", line 313, in call_main
torch.multiprocessing.spawn(
File "<path to venv>/venv/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 199, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "<path to venv>/venv/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 157, in start_processes
while not context.join():
File "<path to venv>/venv/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 118, in join
raise Exception(msg)
Exception:
-- Process 3 terminated with the following error:
Traceback (most recent call last):
File "<path to venv>/venv/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "<path to fairseq>/fairseq/fairseq/distributed_utils.py", line 300, in distributed_main
main(cfg, **kwargs)
File "<path to fairseq>/fairseq/fairseq_cli/train.py", line 66, in main
task = tasks.setup_task(cfg.task)
File "<path to fairseq>/fairseq/fairseq/tasks/__init__.py", line 44, in setup_task
return task.setup_task(cfg, **kwargs)
File "<path to fairseq>/fairseq/fairseq/tasks/speech_to_text.py", line 58, in setup_task
raise FileNotFoundError(f"Dict not found: {dict_path}")
FileNotFoundError: Dict not found: <path to covost root>/dict.txt
```
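
Bug 3 looks like a data-root/config mismatch: the `speech_to_text` task loads `config.yaml` (and the vocabulary file it references) from the directory passed to `fairseq-train`, while the prep script writes `config_asr_<lang>.yaml` under `<covost root>/<lang>`; when no config is found, the task falls back to the default `dict.txt`, which does not exist. A rough layout sanity check (the file names below are assumptions based on the prep script's output, not an official interface):

```python
from pathlib import Path

# Hypothetical paths; point data_root at the directory passed to fairseq-train.
data_root = Path("/path/to/covost_root/es")
lang = "es"

expected = [
    "config.yaml",              # what the task tries to load by default
    f"config_asr_{lang}.yaml",  # what the prep script actually writes
    f"train_asr_{lang}.tsv",
]
for name in expected:
    path = data_root / name
    print(f"{path}: {'found' if path.exists() else 'MISSING'}")
```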
Error messages for Bug 4
```
...
warnings.warn(
epoch 001: 0%| | 0/2 [00:00<?, ?it/s]2020-11-30 14:40:54 | WARNING | fairseq.logging.progress_bar | tensorboard not found, please install with: pip install tensorboardX
2020-11-30 14:40:54 | INFO | fairseq.trainer | begin training epoch 1
Traceback (most recent call last):
File "<path to venv>/venv/bin/fairseq-train", line 33, in <module>
sys.exit(load_entry_point('fairseq', 'console_scripts', 'fairseq-train')())
File "<path to fairseq>/fairseq/fairseq_cli/train.py", line 392, in cli_main
distributed_utils.call_main(cfg, main)
File "<path to fairseq>/fairseq/fairseq/distributed_utils.py", line 313, in call_main
torch.multiprocessing.spawn(
File "<path to venv>/venv/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 199, in spawn
return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
File "<path to venv>/venv/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 157, in start_processes
while not context.join():
File "<path to venv>/venv/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 118, in join
raise Exception(msg)
Exception:
-- Process 7 terminated with the following error:
Traceback (most recent call last):
File "<path to venv>/venv/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 19, in _wrap
fn(i, *args)
File "<path to fairseq>/fairseq/fairseq/distributed_utils.py", line 300, in distributed_main
main(cfg, **kwargs)
File "<path to fairseq>/fairseq/fairseq_cli/train.py", line 130, in main
valid_losses, should_stop = train(cfg, trainer, task, epoch_itr)
File "/usr/lib/python3.8/contextlib.py", line 75, in inner
return func(*args, **kwds)
File "<path to fairseq>/fairseq/fairseq_cli/train.py", line 219, in train
log_output = trainer.train_step(samples)
File "/usr/lib/python3.8/contextlib.py", line 75, in inner
return func(*args, **kwds)
File "<path to fairseq>/fairseq/fairseq/trainer.py", line 540, in train_step
loss, sample_size_i, logging_output = self.task.train_step(
File "<path to fairseq>/fairseq/fairseq/tasks/fairseq_task.py", line 428, in train_step
loss, sample_size, logging_output = criterion(model, sample)
File "<path to venv>/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "<path to fairseq>/fairseq/fairseq/criterions/label_smoothed_cross_entropy.py", line 69, in forward
net_output = model(**sample["net_input"])
File "<path to venv>/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "<path to venv>/venv/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
output = self.module(*inputs[0], **kwargs[0])
File "<path to venv>/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "<path to fairseq>/fairseq/fairseq/models/speech_to_text/s2t_transformer.py", line 259, in forward
decoder_out = self.decoder(
File "<path to venv>/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "<path to fairseq>/fairseq/fairseq/models/transformer.py", line 693, in forward
x, extra = self.extract_features(
File "<path to fairseq>/fairseq/fairseq/models/speech_to_text/s2t_transformer.py", line 381, in extract_features
x, _ = self.extract_features_scriptable(
File "<path to fairseq>/fairseq/fairseq/models/transformer.py", line 810, in extract_features_scriptable
if (encoder_out is not None and len(encoder_out["encoder_out"]) > 0)
TypeError: tuple indices must be integers or slices, not str
/usr/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 8 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
```
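
For context, the `TypeError` in Bug 4 is a plain type mismatch: at this commit the S2T model hands the decoder a tuple-style encoder output (hence the `encoder_out[0]` / `encoder_out[1]` workaround above), while `extract_features_scriptable` in `transformer.py` indexes it like a dictionary. A minimal illustration of why that raises, using a stand-in value rather than the real encoder output:

```python
# Stand-in for the tuple the decoder actually receives; the real object
# carries encoder states and a padding mask, but any tuple reproduces the error.
encoder_out = ("encoder states", "encoder padding mask")

try:
    encoder_out["encoder_out"]  # the dict-style access done at transformer.py line 810
except TypeError as err:
    print(err)  # tuple indices must be integers or slices, not str
```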
Top GitHub Comments
All solved, thank you!
Please pull the latest master branch for the bug fix.