Error with multilingual model and fairseq-generate
Hi, I got an error when generating with a trained multilingual model. I hope you can help me understand what went wrong and how to fix it. Some context: I’m basically trying to use the multilingual architecture as a multitask model, combining different datasets for a monolingual task (each task is a “language” pair).
The command used for training a many-to-one model (i.e. shared decoder) is:
CUDA_VISIBLE_DEVICES=1,2,3 fairseq-train "${data_dir}/bin" \
--ddp-backend=no_c10d \
--task multilingual_translation --lang-pairs orig-simp,complex-simp,long-simp \
--arch multilingual_transformer \
--share-decoders --share-decoder-input-output-embed \
--encoder-embed-path "${glove}" --encoder-embed-dim 300 --encoder-ffn-embed-dim 300 \
--decoder-embed-path "${glove}" --decoder-embed-dim 300 --decoder-ffn-embed-dim 300 \
--encoder-attention-heads 5 --decoder-attention-heads 5 \
--encoder-layers 4 --decoder-layers 4 \
--optimizer adam --adam-betas '(0.9, 0.98)' \
--lr 0.0005 --lr-scheduler inverse_sqrt --min-lr '1e-09' \
--label-smoothing 0.1 --dropout 0.3 --weight-decay 0.0001 \
--criterion label_smoothed_cross_entropy --max-update 10000 \
--warmup-updates 4000 --warmup-init-lr '1e-07' \
--max-tokens 4000 --update-freq 4 \
--save-dir "${model_dir}" --tensorboard-logdir "${log_dir}"
Training proceeds without problems. Now, I want to generate the output for the ‘test’ subset of one of the “language” pairs (orig-simp) that the model was trained on.
fairseq-generate "${data_dir}/bin" \
--path "${model_dir}/${checkpoint_name}.pt" \
--lang-pairs orig-simp,complex-simp,long-simp \
--task multilingual_translation --source-lang orig --target-lang simp \
--batch-size 128 --beam 5 --remove-bpe=sentencepiece \
--gen-subset test > "${experiment_dir}/outputs/${output_name}.out"
After running the command I get the following error:
/experiments/falva/tools/fairseq/fairseq/models/fairseq_model.py:280: UserWarning: FairseqModel is deprecated, please use FairseqEncoderDecoderModel or BaseFairseqModel instead
for key in self.keys
Traceback (most recent call last):
File "/home/falva/anaconda3/envs/mtl4ts/bin/fairseq-generate", line 11, in <module>
load_entry_point('fairseq', 'console_scripts', 'fairseq-generate')()
File "/experiments/falva/tools/fairseq/fairseq_cli/generate.py", line 190, in cli_main
main(args)
File "/experiments/falva/tools/fairseq/fairseq_cli/generate.py", line 47, in main
task=task,
File "/experiments/falva/tools/fairseq/fairseq/checkpoint_utils.py", line 167, in load_model_ensemble
ensemble, args, _task = load_model_ensemble_and_task(filenames, arg_overrides, task)
File "/experiments/falva/tools/fairseq/fairseq/checkpoint_utils.py", line 186, in load_model_ensemble_and_task
model.load_state_dict(state['model'], strict=True, args=args)
TypeError: load_state_dict() got an unexpected keyword argument 'args'
Could you help me understand what’s going on?
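The failure mode in the traceback can be reproduced in isolation. Below is a minimal sketch (illustrative class names, not fairseq's actual classes): `checkpoint_utils` calls `model.load_state_dict(state['model'], strict=True, args=args)`, but if the model class overrides `load_state_dict` with a signature that predates the `args` keyword, Python rejects the extra keyword:

```python
class BaseModel:
    # Newer base-class API that accepts an extra `args` keyword.
    def load_state_dict(self, state_dict, strict=True, args=None):
        return "loaded with args"


class MultilingualModel(BaseModel):
    # Override written against the older API, without `args`.
    def load_state_dict(self, state_dict, strict=True):
        return "loaded without args"


base = BaseModel()
print(base.load_state_dict({}, strict=True, args=None))  # -> loaded with args

multi = MultilingualModel()
try:
    # Same call the checkpoint loader makes; the override rejects `args`.
    multi.load_state_dict({}, strict=True, args=None)
except TypeError as e:
    print(type(e).__name__)  # -> TypeError
```

This suggests the multilingual model's `load_state_dict` override was not updated when the `args` keyword was added to the checkpoint-loading path.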
Some additional and perhaps useful information and questions:
- For all ‘language’ pairs, all dataset splits (train/valid/test) were binarized before training using `fairseq-preprocess`. That’s why I decided to use `fairseq-generate` instead of `fairseq-interactive`. I don’t think this could be the source of the problem, right? Or is there a particular reason why, for the multilingual model, it’s recommended to use `interactive` rather than `generate`, as in the example you provide in your repo?
- Since in this case I’m using a many-to-one model (just as in the example you provide), there is no need to use the `--encoder-langtok` or `--decoder-langtok` arguments. To my understanding, `--encoder-langtok` comes into play if I wanted to train a one-to-many model (`--encoder-langtok tgt`). But when would `--decoder-langtok` be necessary in your experience?
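For context on the question above, here is a conceptual sketch of what the langtok flags do during preprocessing (helper names are mine, not fairseq's API; the `__lang__` token format is an assumption based on fairseq's multilingual examples):

```python
def langtok(lang):
    # fairseq-style language marker token, e.g. "__simp__" (assumed format)
    return f"__{lang}__"


def encoder_input(src_tokens, src_lang, tgt_lang, encoder_langtok=None):
    """Prepend a language token to the encoder input.

    --encoder-langtok src -> mark the source language
    --encoder-langtok tgt -> mark the desired target language, which is
    how a one-to-many model knows which output language to produce.
    """
    if encoder_langtok == "src":
        return [langtok(src_lang)] + src_tokens
    if encoder_langtok == "tgt":
        return [langtok(tgt_lang)] + src_tokens
    return src_tokens


src = "an original sentence".split()
print(encoder_input(src, "orig", "simp", encoder_langtok="tgt"))
# -> ['__simp__', 'an', 'original', 'sentence']
```

By the same logic, `--decoder-langtok` prepends the target-language token on the decoder side, which is mainly useful when a single shared decoder must produce several target languages (one-to-many or many-to-many); with a single target language, as here, it adds nothing.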
Thank you in advance for all the help.
Issue Analytics
- State:
- Created 4 years ago
- Comments: 5 (3 by maintainers)
Top GitHub Comments
+1. I ran the commands shown in https://github.com/pytorch/fairseq/tree/master/examples/translation#multilingual-translation exactly as they are written, but during generation it gives the “missing `--lang-pairs`” error. I then add `--lang-pairs de-en,fr-en`, and it gives the error: `TypeError: load_state_dict() got an unexpected keyword argument 'args'`
Extra info: I even tried adding an `args` argument to the `load_state_dict()` method in `fairseq/models/multilingual_transformer.py`, but then it gives the error:

Should be fixed with 4c5934ac61354d9b6d164f7317905e4ac2ae1064
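Until that commit is in your install, one generic stopgap (a hedged sketch of my own, not fairseq's actual fix; the helper name is hypothetical) is to call the method while dropping keyword arguments its signature does not accept:

```python
import inspect


def call_dropping_unknown_kwargs(fn, *pos, **kw):
    """Call fn, silently discarding keyword args it does not accept."""
    params = inspect.signature(fn).parameters
    accepted = {k: v for k, v in kw.items() if k in params}
    return fn(*pos, **accepted)


class LegacyModel:
    # Stand-in for a model whose load_state_dict lacks the `args` keyword.
    def load_state_dict(self, state_dict, strict=True):
        return ("loaded", strict)


m = LegacyModel()
# A direct call with args=... would raise TypeError; the wrapper drops it.
print(call_dropping_unknown_kwargs(m.load_state_dict, {}, strict=True, args=None))
# -> ('loaded', True)
```

The upstream fix is cleaner (making the override accept `args` itself), so upgrading fairseq past the commit above is the real solution.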