Error with multilingual model and fairseq-generate
Hi, I got an error when generating with a trained multilingual model. I hope you can help me understand what went wrong and how to fix it. Some context: I’m basically trying to use the multilingual architecture as a multitask model, combining different datasets for a monolingual task (each task is a “language” pair).
The command used for training a many-to-one model (i.e. shared decoder) is:
CUDA_VISIBLE_DEVICES=1,2,3 fairseq-train "${data_dir}/bin" \
--ddp-backend=no_c10d \
--task multilingual_translation --lang-pairs orig-simp,complex-simp,long-simp \
--arch multilingual_transformer \
--share-decoders --share-decoder-input-output-embed \
--encoder-embed-path "${glove}" --encoder-embed-dim 300 --encoder-ffn-embed-dim 300 \
--decoder-embed-path "${glove}" --decoder-embed-dim 300 --decoder-ffn-embed-dim 300 \
--encoder-attention-heads 5 --decoder-attention-heads 5 \
--encoder-layers 4 --decoder-layers 4 \
--optimizer adam --adam-betas '(0.9, 0.98)' \
--lr 0.0005 --lr-scheduler inverse_sqrt --min-lr '1e-09' \
--label-smoothing 0.1 --dropout 0.3 --weight-decay 0.0001 \
--criterion label_smoothed_cross_entropy --max-update 10000 \
--warmup-updates 4000 --warmup-init-lr '1e-07' \
--max-tokens 4000 --update-freq 4 \
--save-dir "${model_dir}" --tensorboard-logdir "${log_dir}"
Training proceeds without problems. Now, I want to generate the output for the ‘test’ subset of one of the “language” pairs (orig-simp) that the model was trained on.
fairseq-generate "${data_dir}/bin" \
--path "${model_dir}/${checkpoint_name}.pt" \
--lang-pairs orig-simp,complex-simp,long-simp \
--task multilingual_translation --source-lang orig --target-lang simp \
--batch-size 128 --beam 5 --remove-bpe=sentencepiece \
--gen-subset test > "${experiment_dir}/outputs/${output_name}.out"
After running the command I get the following error:
/experiments/falva/tools/fairseq/fairseq/models/fairseq_model.py:280: UserWarning: FairseqModel is deprecated, please use FairseqEncoderDecoderModel or BaseFairseqModel instead
for key in self.keys
Traceback (most recent call last):
File "/home/falva/anaconda3/envs/mtl4ts/bin/fairseq-generate", line 11, in <module>
load_entry_point('fairseq', 'console_scripts', 'fairseq-generate')()
File "/experiments/falva/tools/fairseq/fairseq_cli/generate.py", line 190, in cli_main
main(args)
File "/experiments/falva/tools/fairseq/fairseq_cli/generate.py", line 47, in main
task=task,
File "/experiments/falva/tools/fairseq/fairseq/checkpoint_utils.py", line 167, in load_model_ensemble
ensemble, args, _task = load_model_ensemble_and_task(filenames, arg_overrides, task)
File "/experiments/falva/tools/fairseq/fairseq/checkpoint_utils.py", line 186, in load_model_ensemble_and_task
model.load_state_dict(state['model'], strict=True, args=args)
TypeError: load_state_dict() got an unexpected keyword argument 'args'
Could you help me understand what’s going on?
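The failure mode in the traceback can be reproduced in isolation. Below is a minimal sketch (illustrative class names, not fairseq's actual classes): `checkpoint_utils` calls `model.load_state_dict(state['model'], strict=True, args=args)`, but if the model class overrides `load_state_dict` with a signature that predates the `args` keyword, Python rejects the extra keyword:

```python
class BaseModel:
    # Newer base-class API that accepts an extra `args` keyword.
    def load_state_dict(self, state_dict, strict=True, args=None):
        return "loaded with args"


class MultilingualModel(BaseModel):
    # Override written against the older API, without `args`.
    def load_state_dict(self, state_dict, strict=True):
        return "loaded without args"


base = BaseModel()
print(base.load_state_dict({}, strict=True, args=None))  # -> loaded with args

multi = MultilingualModel()
try:
    # Same call the checkpoint loader makes; the override rejects `args`.
    multi.load_state_dict({}, strict=True, args=None)
except TypeError as e:
    print(type(e).__name__)  # -> TypeError
```

This suggests the multilingual model's `load_state_dict` override was not updated when the `args` keyword was added to the checkpoint-loading path.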
Some additional and perhaps useful information and questions:
- For all ‘language’ pairs, all dataset splits (train/valid/test) were binarized before training using `fairseq-preprocess`. That’s why I decided to use `fairseq-generate` instead of `fairseq-interactive`. I don’t think this could be the source of the problem, right? Or is there a particular reason why, for the multilingual model, it’s recommended to use `interactive` rather than `generate`, as in the example you provide in your repo?
- Since in this case I’m using a many-to-one model (just as in the example you provide), there is no need to use the `--encoder-langtok` or `--decoder-langtok` arguments. To my understanding, `--encoder-langtok` comes into play if I wanted to train a one-to-many model (`--encoder-langtok tgt`). But when would `--decoder-langtok` be necessary in your experience?
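For context on the question above, here is a conceptual sketch of what the langtok flags do during preprocessing (helper names are mine, not fairseq's API; the `__lang__` token format is an assumption based on fairseq's multilingual examples):

```python
def langtok(lang):
    # fairseq-style language marker token, e.g. "__simp__" (assumed format)
    return f"__{lang}__"


def encoder_input(src_tokens, src_lang, tgt_lang, encoder_langtok=None):
    """Prepend a language token to the encoder input.

    --encoder-langtok src -> mark the source language
    --encoder-langtok tgt -> mark the desired target language, which is
    how a one-to-many model knows which output language to produce.
    """
    if encoder_langtok == "src":
        return [langtok(src_lang)] + src_tokens
    if encoder_langtok == "tgt":
        return [langtok(tgt_lang)] + src_tokens
    return src_tokens


src = "an original sentence".split()
print(encoder_input(src, "orig", "simp", encoder_langtok="tgt"))
# -> ['__simp__', 'an', 'original', 'sentence']
```

By the same logic, `--decoder-langtok` prepends the target-language token on the decoder side, which is mainly useful when a single shared decoder must produce several target languages (one-to-many or many-to-many); with a single target language, as here, it adds nothing.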
Thank you in advance for all the help.
Issue Analytics
- State:
- Created 4 years ago
- Comments: 5 (3 by maintainers)
Top GitHub Comments
+1. I ran the commands shown in https://github.com/pytorch/fairseq/tree/master/examples/translation#multilingual-translation exactly as they are written, but during generation it gives the “missing `--lang-pairs`” error. I then add `--lang-pairs de-en,fr-en`, and it gives the error: `TypeError: load_state_dict() got an unexpected keyword argument 'args'`
Extra info: I even tried adding an `args` argument to the `load_state_dict()` method in `fairseq/models/multilingual_transformer.py`, but then it gives the error:

Should be fixed with 4c5934ac61354d9b6d164f7317905e4ac2ae1064
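Until that commit is in your install, one generic stopgap (a hedged sketch of my own, not fairseq's actual fix; the helper name is hypothetical) is to call the method while dropping keyword arguments its signature does not accept:

```python
import inspect


def call_dropping_unknown_kwargs(fn, *pos, **kw):
    """Call fn, silently discarding keyword args it does not accept."""
    params = inspect.signature(fn).parameters
    accepted = {k: v for k, v in kw.items() if k in params}
    return fn(*pos, **accepted)


class LegacyModel:
    # Stand-in for a model whose load_state_dict lacks the `args` keyword.
    def load_state_dict(self, state_dict, strict=True):
        return ("loaded", strict)


m = LegacyModel()
# A direct call with args=... would raise TypeError; the wrapper drops it.
print(call_dropping_unknown_kwargs(m.load_state_dict, {}, strict=True, args=None))
# -> ('loaded', True)
```

The upstream fix is cleaner (making the override accept `args` itself), so upgrading fairseq past the commit above is the real solution.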