Multilingual_translation error at inference

For the preprocessing, you can use the same commands in the documentation for each language pair, for example:

fairseq-preprocess --source-lang de --target-lang en --trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test --destdir data-bin/

Then you execute the same command for the second language pair:

fairseq-preprocess --source-lang it --target-lang en --trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test --destdir data-bin/
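Since the two preprocessing commands differ only in the source language, they can be generated from the list of language pairs. The following is a minimal sketch, not part of fairseq: the helper name `preprocess_commands` and the `text_dir`/`dest_dir` parameters are assumptions made for illustration, and only the flags already shown above are used.

```python
def preprocess_commands(lang_pairs, text_dir, dest_dir):
    """Build one fairseq-preprocess argument list per 'src-tgt' pair.

    Hypothetical helper: it only assembles the flags shown in the post
    and does not run anything.
    """
    commands = []
    for pair in lang_pairs:
        src, tgt = pair.split("-")
        commands.append([
            "fairseq-preprocess",
            "--source-lang", src,
            "--target-lang", tgt,
            "--trainpref", f"{text_dir}/train",
            "--validpref", f"{text_dir}/valid",
            "--testpref", f"{text_dir}/test",
            "--destdir", dest_dir,
        ])
    return commands


if __name__ == "__main__":
    # Same pairs as passed to --lang-pairs during training.
    for argv in preprocess_commands(["de-en", "it-en"], "$TEXT", "data-bin/"):
        print(" ".join(argv))
```

Keeping the list of pairs in one place makes it harder for the preprocessing step and the `--lang-pairs` value used for training to drift apart.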

As for the training, here is a sample command that I used:

python train.py \raw-data\data-bin --task multilingual_translation --criterion label_smoothed_cross_entropy --arch multilingual_transformer --max-epoch 26 --lr 1.0 --wd 0.5 --lang-pairs de-en,it-en --encoder-layers 2 --decoder-layers 2 --save-dir data\checkpoints --optimizer sgd

Inference:

fairseq-interactive \raw-data\data-bin --task multilingual_translation --source-lang it --target-lang en --path \checkpoints\checkpoint20.pt --input \raw-data\test.it --beam 5

For the inference, however, I am stuck at this error:

Traceback (most recent call last):
  File "c:.…\lib\site-packages\fairseq_cli\interactive.py", line 82, in main
    args.path.split(':'), task, model_arg_overrides=eval(args.model_overrides),
  File "c:.…\lib\site-packages\fairseq\utils.py", line 164, in load_ensemble_for_inference
    model = task.build_model(args)
  File "c:.…\lib\site-packages\fairseq\tasks\multilingual_translation.py", line 180, in build_model
    model = models.build_model(args, self)
  File "c:.…\lib\site-packages\fairseq\models\__init__.py", line 33, in build_model
    return ARCH_MODEL_REGISTRY[args.arch].build_model(args, task)
  File "c:.…\lib\site-packages\fairseq\models\multilingual_transformer.py", line 162, in build_model
    encoders[lang_pair] = shared_encoder if shared_encoder is not None else get_encoder(src)
  File "c:.…\lib\site-packages\fairseq\models\multilingual_transformer.py", line 137, in get_encoder
    task.dicts[lang], args.encoder_embed_dim, args.encoder_embed_path
KeyError: 'de'

It seems to require the second source language (de), but I don't know why or how to solve this. I hope someone can tell me what I am missing.
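One plausible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: the model is rebuilt with an encoder for every pair it was trained on (de-en and it-en), while the task at inference time has loaded dictionaries only for the languages named on the command line (it and en). The toy sketch below is not fairseq code; `load_dicts` and `build_encoders` are hypothetical stand-ins that reproduce the same kind of lookup failure.

```python
def load_dicts(langs):
    """Hypothetical stand-in for the task's per-language dictionaries."""
    return {lang: f"<dictionary for {lang}>" for lang in langs}


def build_encoders(lang_pairs, dicts):
    """Build one placeholder encoder per pair from its source dictionary."""
    encoders = {}
    for pair in lang_pairs:
        src, _tgt = pair.split("-")
        # Raises KeyError when the source language has no loaded dictionary.
        encoders[pair] = f"encoder using {dicts[src]}"
    return encoders


# Pairs the model was trained with vs. languages available at inference.
trained_pairs = ["de-en", "it-en"]
inference_dicts = load_dicts(["it", "en"])

try:
    build_encoders(trained_pairs, inference_dicts)
except KeyError as err:
    print(f"KeyError: {err}")  # prints: KeyError: 'de'
```

In this toy version the lookup succeeds once a dictionary for every trained source language is loaded, for example with `load_dicts(["de", "it", "en"])`.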

_Originally posted by @AyaNsar in https://github.com/pytorch/fairseq/issues/497#issuecomment-462551184_

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 10 (7 by maintainers)

Top GitHub Comments

3 reactions
pipibjc commented, Feb 12, 2019

Thanks for reporting the error. This is a bug, and I will fix the problem shortly.

1 reaction
AyaNsar commented, Feb 15, 2019

Can't thank you enough! It works like a charm.
