fairseq-preprocess does not work while training custom model
🐛 Bug
I am following the tutorial at https://github.com/pytorch/fairseq/blob/master/examples/roberta/README.pretraining.md to train a custom model using RoBERTa. It gets stuck at the preprocess step with a few errors.
To Reproduce
Steps to reproduce the behavior (always include the command you ran):
Follow the exact steps in the tutorial. Running this command throws the error:
fairseq-preprocess --only-source --srcdict gpt2_bpe/dict.txt --trainpref wikitext-103-raw/wiki.train.bpe --validpref wikitext-103-raw/wiki.valid.bpe --testpref wikitext-103-raw/wiki.test.bpe --destdir data-bin/wikitext-103 --workers 60
Stack trace:
Traceback (most recent call last):
  File "/usr/local/bin/fairseq-preprocess", line 33, in <module>
    sys.exit(load_entry_point('fairseq', 'console_scripts', 'fairseq-preprocess')())
  File "/usr/local/bin/fairseq-preprocess", line 25, in importlib_load_entry_point
    return next(matches).load()
  File "/usr/lib/python3.8/importlib/metadata.py", line 77, in load
    module = import_module(match.group('module'))
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/adutta/Documents/git_repo/fairseq/fairseq_cli/preprocess.py", line 18, in <module>
    from fairseq import options, tasks, utils
  File "/home/adutta/Documents/git_repo/fairseq/fairseq/__init__.py", line 32, in <module>
    import fairseq.criterions  # noqa
  File "/home/adutta/Documents/git_repo/fairseq/fairseq/criterions/__init__.py", line 36, in <module>
    importlib.import_module("fairseq.criterions." + file_name)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/adutta/Documents/git_repo/fairseq/fairseq/criterions/label_smoothed_cross_entropy_latency_augmented.py", line 6, in <module>
    from examples.simultaneous_translation.utils.latency import LatencyTraining
  File "/home/adutta/Documents/git_repo/fairseq/examples/simultaneous_translation/__init__.py", line 6, in <module>
    from . import criterions, eval, models  # noqa
  File "/home/adutta/Documents/git_repo/fairseq/examples/simultaneous_translation/models/__init__.py", line 13, in <module>
    importlib.import_module(
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/adutta/Documents/git_repo/fairseq/examples/simultaneous_translation/models/transformer_monotonic_attention.py", line 13, in <module>
    from fairseq.models import (
  File "/home/adutta/Documents/git_repo/fairseq/fairseq/models/__init__.py", line 208, in <module>
    module = importlib.import_module("fairseq.models." + model_name)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/home/adutta/Documents/git_repo/fairseq/fairseq/models/speech_to_text/__init__.py", line 8, in <module>
    from .convtransformer_simul_trans import *  # noqa
  File "/home/adutta/Documents/git_repo/fairseq/fairseq/models/speech_to_text/convtransformer_simul_trans.py", line 8, in <module>
    from examples.simultaneous_translation.models.transformer_monotonic_attention import (
ImportError: cannot import name 'TransformerMonotonicDecoder' from partially initialized module 'examples.simultaneous_translation.models.transformer_monotonic_attention' (most likely due to a circular import) (/home/adutta/Documents/git_repo/fairseq/examples/simultaneous_translation/models/transformer_monotonic_attention.py)
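For context, the final ImportError describes a circular import: fairseq/models/speech_to_text/convtransformer_simul_trans.py imports from examples/simultaneous_translation/models/transformer_monotonic_attention.py, which in turn imports from fairseq.models while that package is still being initialized. Below is a minimal, self-contained sketch of the same failure mode; the file and class names (models.py, speech_to_text.py, TransformerDecoder) are hypothetical stand-ins, not actual fairseq code.

# circular_import_demo.py -- hypothetical sketch, not fairseq code.
# Two modules import each other, so the second import runs against a
# module that is only partially initialized, which is the kind of error
# the traceback above reports.
import os
import sys
import tempfile

pkg = tempfile.mkdtemp()

# Stands in for fairseq/models/__init__.py: it pulls in a submodule
# before finishing its own top-level definitions.
with open(os.path.join(pkg, "models.py"), "w") as f:
    f.write(
        "from speech_to_text import *  # starts importing the other module\n"
        "class TransformerDecoder:  # only defined after the import above\n"
        "    pass\n"
    )

# Stands in for convtransformer_simul_trans.py: it imports back into the
# module that is still in the middle of executing.
with open(os.path.join(pkg, "speech_to_text.py"), "w") as f:
    f.write("from models import TransformerDecoder\n")

sys.path.insert(0, pkg)
try:
    import models
except ImportError as e:
    # ImportError: cannot import name 'TransformerDecoder' from partially
    # initialized module 'models' (most likely due to a circular import)
    print(e)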
Environment
- fairseq Version (e.g., 1.0 or master): master (latest pull from GitHub)
- PyTorch Version (e.g., 1.0): 1.7.1
- OS (e.g., Linux): Ubuntu 20.04
- How you installed fairseq (pip, source): source
- Build command you used (if compiling from source): git clone https://github.com/pytorch/fairseq; cd fairseq; pip3 install --editable ./
- Python version: 3.8
- CUDA/cuDNN version: 11.2
- GPU models and configuration: GeForce RTX 2080
Top GitHub Comments
Currently I revert to a commit from before the file was changed and rebuild. Not good, but it works.
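(The comment does not give the exact commands or commit hash. Roughly, that workaround amounts to running git checkout <commit-before-change> inside the fairseq checkout, where <commit-before-change> is a placeholder for a commit predating the offending change, and then rebuilding with pip3 install --editable ./ as in the build command above.)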
The error is gone for me with current master.