Correct parameters and training setup for the model from the paper "Training Neural Machine Translation To Apply Terminology Constraints"
According to the paper Training Neural Machine Translation To Apply Terminology Constraints, I added the parameters given in the appendix in order to recreate the model described there.
The training data are the given Europarl corpus and the validation (dev) data are the News-Commentary corpus. I processed the data in the same way as in the Multilingual Zero-shot Translation IWSLT 2017 tutorial, because that matches the processing used in this paper; see the sketch below. For the batch size I chose a higher value.
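Roughly, the preprocessing looks like this. This is only a minimal sketch, assuming sacremoses for tokenization and subword-nmt for the joint BPE model, with placeholder file names and English-German data; the tutorial's exact scripts may differ:

# Tokenize both sides (placeholder file names)
sacremoses -l en tokenize < train.raw.src > train.tok.src
sacremoses -l de tokenize < train.raw.trg > train.tok.trg
# Learn a joint BPE model on both sides, as in the IWSLT 2017 tutorial;
# 32000 merges is my assumption, roughly matching --num-words below
subword-nmt learn-joint-bpe-and-vocab \
    --input train.tok.src train.tok.trg -s 32000 \
    -o bpe.codes --write-vocabulary vocab.src vocab.trg
# Apply the learned BPE to each side
subword-nmt apply-bpe -c bpe.codes --vocabulary vocab.src < train.tok.src > train.bpe.src
subword-nmt apply-bpe -c bpe.codes --vocabulary vocab.trg < train.tok.trg > train.bpe.trg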
Do you agree with the parameters and data processing according to the paper, or did I forget something? I know I could leave out some parameters because their values are already the defaults, but I want to make sure that this list of parameters is complete.
sockeye-train -d train_data \
-vs data/news-commentary-v13.tag.src \
-vt data/news-commentary-v13.tag.trg \
--shared-vocab \
-o amazon_model \
--transformer-attention-heads 8:8 \
--transformer-activation-type relu:relu \
--transformer-dropout-act 0.1:0.1 \
--transformer-dropout-attention 0.1:0.1 \
--transformer-dropout-prepost 0.1:0.1 \
--dtype float32 \
--transformer-feed-forward-num-hidden 2048:2048 \
--max-seq-len 101:101 \
--transformer-model-size 512:512 --num-layers 2:2 \
--transformer-positional-embedding-type fixed \
--transformer-postprocess dr:dr \
--transformer-preprocess n:n \
--embed-dropout 0.0:0.0 \
--label-smoothing 0.1 \
--loss cross-entropy \
--num-words 32302:32302 \
--num-embed 512:512 \
--source-factors-num-embed 1 \
--target-factors-num-embed 1 \
--min-num-epochs 50 \
--max-num-epochs 100 \
--batch-size 560
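
For completeness, this is how I plan to decode with source factors at test time. Again only a sketch with placeholder file names; the factor file has to be token-parallel to the BPE-ed input:

sockeye-translate \
    -m amazon_model \
    --input test.bpe.src \
    --input-factors test.factors \
    --output test.out.trg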

You should not apply BPE after adding source factors. Source factors should be added to the already BPE-ed input that is fed into Sockeye. In your command above, your echo input already appears to be BPE-tokenized, since it contains '@@' markers.
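To make the ordering concrete, here is a toy example with hypothetical file names. The factor values follow the paper's annotation scheme (0 = regular source word, 1 = source term, 2 = injected target term), and the factor file must contain exactly one value per BPE token of the source line, which is why the factors are written after BPE has been applied:

# BPE-ed source line with the target term "Entwicklung" injected
# directly after its source term "development"
echo "the develop@@ ment Entwick@@ lung of the market" > train.bpe.src
# One factor per BPE token above (8 tokens -> 8 values)
echo "0 1 1 2 2 0 0 0" > train.factors

Both files are then passed to sockeye-train (or sockeye-prepare-data) via --source-factors.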
Happy to hear it is working for you now.