
fairseq-preprocess does not work while training custom model

See original GitHub issue

🐛 Bug

I am following the tutorial at https://github.com/pytorch/fairseq/blob/master/examples/roberta/README.pretraining.md to train a custom model using RoBERTa. It gets stuck at the preprocess step with a few errors.

To Reproduce

Steps to reproduce the behavior (always include the command you ran):

Follow the exact steps in the tutorial. Running this command throws the error:

fairseq-preprocess --only-source --srcdict gpt2_bpe/dict.txt --trainpref wikitext-103-raw/wiki.train.bpe --validpref wikitext-103-raw/wiki.valid.bpe --testpref wikitext-103-raw/wiki.test.bpe --destdir data-bin/wikitext-103 --workers 60
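For context, passing --srcdict tells fairseq-preprocess to reuse an existing dictionary (one `token count` pair per line, with special symbols reserved first) and replace each BPE token in the input files with its integer ID. The sketch below is a conceptual, fairseq-free illustration of that mapping; the function names and toy dictionary lines are made up for the example and do not reflect fairseq's actual on-disk format.

```python
# Conceptual sketch of dictionary-based binarization, as done by
# fairseq-preprocess with --srcdict. Illustrative only.

def load_dict(lines):
    """Parse fairseq-style dict lines ("token count") into token -> id.
    fairseq reserves the lowest IDs for special symbols."""
    specials = ["<s>", "<pad>", "</s>", "<unk>"]
    mapping = {tok: i for i, tok in enumerate(specials)}
    for line in lines:
        token = line.split()[0]  # the count column is ignored here
        mapping[token] = len(mapping)
    return mapping

def binarize(sentence, vocab):
    """Replace each whitespace-separated token with its ID,
    falling back to the <unk> ID for out-of-vocabulary tokens."""
    unk = vocab["<unk>"]
    return [vocab.get(tok, unk) for tok in sentence.split()]

vocab = load_dict(["13 900", "262 750", "995 100"])  # toy dict lines
print(binarize("13 262 99999", vocab))  # 99999 is not in the dict
```

The real tool additionally writes the ID sequences to binary .bin/.idx files in data-bin/wikitext-103, which is what fairseq-train later reads.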

Stack trace:

Traceback (most recent call last):
  File "/usr/local/bin/fairseq-preprocess", line 33, in <module>
    sys.exit(load_entry_point('fairseq', 'console_scripts', 'fairseq-preprocess')())
  File "/usr/local/bin/fairseq-preprocess", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/usr/lib/python3.8/importlib/metadata.py", line 77, in load
    module = import_module(match.group('module'))
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/adutta/Documents/git_repo/fairseq/fairseq_cli/preprocess.py", line 18, in <module>
    from fairseq import options, tasks, utils
  File "/home/adutta/Documents/git_repo/fairseq/fairseq/__init__.py", line 32, in <module>
    import fairseq.criterions  # noqa
  File "/home/adutta/Documents/git_repo/fairseq/fairseq/criterions/__init__.py", line 36, in <module>
    importlib.import_module("fairseq.criterions." + file_name)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/adutta/Documents/git_repo/fairseq/fairseq/criterions/label_smoothed_cross_entropy_latency_augmented.py", line 6, in <module>
    from examples.simultaneous_translation.utils.latency import LatencyTraining
  File "/home/adutta/Documents/git_repo/fairseq/examples/simultaneous_translation/__init__.py", line 6, in <module>
    from . import criterions, eval, models  # noqa
  File "/home/adutta/Documents/git_repo/fairseq/examples/simultaneous_translation/models/__init__.py", line 13, in <module>
    importlib.import_module(
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/adutta/Documents/git_repo/fairseq/examples/simultaneous_translation/models/transformer_monotonic_attention.py", line 13, in <module>
    from fairseq.models import (
  File "/home/adutta/Documents/git_repo/fairseq/fairseq/models/__init__.py", line 208, in <module>
    module = importlib.import_module("fairseq.models." + model_name)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/adutta/Documents/git_repo/fairseq/fairseq/models/speech_to_text/__init__.py", line 8, in <module>
    from .convtransformer_simul_trans import *  # noqa
  File "/home/adutta/Documents/git_repo/fairseq/fairseq/models/speech_to_text/convtransformer_simul_trans.py", line 8, in <module>
    from examples.simultaneous_translation.models.transformer_monotonic_attention import (
ImportError: cannot import name 'TransformerMonotonicDecoder' from partially initialized module 'examples.simultaneous_translation.models.transformer_monotonic_attention' (most likely due to a circular import) (/home/adutta/Documents/git_repo/fairseq/examples/simultaneous_translation/models/transformer_monotonic_attention.py)
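The error at the bottom of the traceback is Python's standard signature for a module-level circular import: fairseq's model registry imports the simultaneous-translation example while that example is still executing, so the name it needs does not exist yet. The minimal reproduction below shows the same failure mode; the module names (cyc_a, cyc_b) are hypothetical stand-ins, not fairseq modules.

```python
# Minimal reproduction of "cannot import name ... from partially
# initialized module ... (most likely due to a circular import)".
import os
import sys
import tempfile

tmp = tempfile.mkdtemp()

# cyc_a imports a name from cyc_b at module top level...
with open(os.path.join(tmp, "cyc_a.py"), "w") as f:
    f.write("from cyc_b import helper_b\n")

# ...and cyc_b imports a name back from cyc_a, closing the cycle.
with open(os.path.join(tmp, "cyc_b.py"), "w") as f:
    f.write("from cyc_a import helper_a\nhelper_b = None\n")

sys.path.insert(0, tmp)
try:
    import cyc_a  # noqa: F401
    msg = "no error"
except ImportError as e:
    # cyc_a is still executing when cyc_b tries to import from it,
    # so Python 3.8+ reports it as "partially initialized".
    msg = str(e)

print(msg)
```

The usual fixes are deferring one of the imports into a function body or removing the module-level dependency, which is essentially what the later fairseq commit did.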

Environment

  • fairseq Version (e.g., 1.0 or master): master (latest pull from GitHub)
  • PyTorch Version (e.g., 1.0): 1.7.1
  • OS (e.g., Linux): Ubuntu 20.04
  • How you installed fairseq (pip, source): source
  • Build command you used (if compiling from source): git clone https://github.com/pytorch/fairseq; cd fairseq; pip3 install --editable ./
  • Python version: 3.8
  • CUDA/cuDNN version: 11.2
  • GPU models and configuration: GeForce RTX 2080

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 2
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

2 reactions
olafthiele commented, Feb 21, 2021

For now, I revert to a commit from before the file was changed and rebuild. Not ideal, but it works:

git checkout da9eaba12d82b9bfc1442f0e2c6fc1b895f4d35d
pip install --editable ./
1 reaction
olafthiele commented, Feb 24, 2021

The error is gone for me with the current master.

Read more comments on GitHub >

Top Results From Across the Web

fairseq-preprocess does not work while training custom model
Running this command throws the error. fairseq-preprocess --only-source --srcdict gpt2_bpe/dict.txt --trainpref wikitext-103-raw/wiki.train.bpe ...
Command-line Tools — fairseq 0.12.2 documentation
Fairseq provides several command-line tools for training and evaluating models: fairseq-preprocess: Data pre-processing: build vocabularies and binarize ...
Guidance on using FAIRseq for seq2seq tasks - Google Groups
Hi, I want to use FairSeq for a custom seq2seq task but had a few doubts using it:- What is the format required...
fairseq Users | Hi, | Facebook
Hi, I'm using a big model from fairseq. And I'm experimenting using pre-trained embedding with fairseq. But when i start training with embeddings......
fairseq/examples/translation/README.md - Hugging Face
Training a new model. IWSLT'14 German to English (Transformer). The following instructions can be used to train ...
