
How to reproduce the result of WMT14 en-de on transformer BASE model?

See original GitHub issue

Hi

I want to replicate the WMT14 En-De translation result of the Transformer base model from the paper “Attention Is All You Need”. Following the instructions here, I downloaded and preprocessed the data. Then I trained the model with this (see the preprocessing sketch after the command below):

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py data-bin/wmt16_en_de_bpe32k \
    --arch transformer_wmt_en_de --share-all-embeddings \
    --optimizer adam --adam-betas '(0.9, 0.98)' --clip-norm 0.0 \
    --lr-scheduler inverse_sqrt --warmup-init-lr 1e-07 --warmup-updates 4000 \
    --lr 0.0005 --min-lr 1e-09 \
    --criterion label_smoothed_cross_entropy --label-smoothing 0.1 --weight-decay 0.0 \
    --max-tokens 4096 --save-dir checkpoints/en-de \
    --update-freq 2 --no-progress-bar --log-format json --log-interval 50 \
    --save-interval-updates 1000 --keep-interval-updates 20
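
For context, a rough sketch of the preprocessing step mentioned above, assuming the pre-encoded WMT'16 En-De BPE data and the preprocess.py flags fairseq shipped at the time (the file names follow the standard Google-preprocessed distribution and may differ in your setup):

# Binarize the BPE-encoded data with a joined source/target dictionary,
# which is required for --share-all-embeddings during training:
TEXT=wmt16_en_de_bpe32k
python preprocess.py --source-lang en --target-lang de \
    --trainpref $TEXT/train.tok.clean.bpe.32000 \
    --validpref $TEXT/newstest2013.tok.bpe.32000 \
    --testpref $TEXT/newstest2014.tok.bpe.32000 \
    --destdir data-bin/wmt16_en_de_bpe32k \
    --joined-dictionary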

I averaged the last 5 checkpoints (see the averaging sketch after the command below) and generated the translation with this:

model=model.pt
subset="test"

CUDA_VISIBLE_DEVICES=0 python generate.py data-bin/wmt16_en_de_bpe32k \
    --path checkpoints/$model --gen-subset $subset \
    --beam 4 --batch-size 128 --remove-bpe --lenpen 0.6
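
For reference, the checkpoint averaging mentioned above was presumably done with fairseq's scripts/average_checkpoints.py; a minimal sketch, assuming update-based checkpoints in the save directory used during training:

# Average the last 5 update-based checkpoints into a single model file
# (the output path matches $model in the generation command above):
python scripts/average_checkpoints.py \
    --inputs checkpoints/en-de \
    --num-update-checkpoints 5 \
    --output checkpoints/model.pt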

However, after about 120k updates, I got:
| Generate test with beam=4: BLEU4 = 26.38, 57.8/32.0/20.0/13.1 (BP=1.000, ratio=1.020, syslen=64352, reflen=63078)

After about 250k updates, I got:
| Generate test with beam=4: BLEU4 = 26.39, 57.8/32.0/20.0/13.1 (BP=1.000, ratio=1.017, syslen=64123, reflen=63078)

This is far from the 27.3 BLEU reported in “Attention Is All You Need”. Can you think of any reasons for the gap? Thanks a lot!

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 10
  • Comments: 23 (7 by maintainers)

Top GitHub Comments

myleott commented, Nov 7, 2018 (10 reactions)

Great! The last step to reproduce results from Vaswani et al. is to split compound words. This step gives a moderate increase in BLEU but is somewhat hacky. In general it’s preferable to report detokenized BLEU via tools like sacrebleu, although detok. BLEU is usually lower than tokenized BLEU. See this paper: https://arxiv.org/abs/1804.08771
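
For readers following along, a minimal sacrebleu invocation might look like this (gen.detok.out is a hypothetical file holding detokenized system output, one sentence per line):

# Score detokenized output against the official WMT14 En-De references
# (wmt14/full is the complete 3003-sentence test set used in the paper):
pip install sacrebleu
cat gen.detok.out | sacrebleu -t wmt14/full -l en-de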

Here is the script: https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/utils/get_ende_bleu.sh The compound splitting is near the bottom of the script.
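
The splitting itself boils down to a one-line Perl substitution that breaks hyphenated compounds into separate tokens; a sketch from memory (defer to the script above for the exact form), applied to both hypothesis and reference before scoring:

# Split hyphenated compounds so each part is scored as its own token:
perl -ple 's{(\S)-(\S)}{$1 ##AT##-##AT## $2}g' < hyp.tok > hyp.tok.atat
perl -ple 's{(\S)-(\S)}{$1 ##AT##-##AT## $2}g' < ref.tok > ref.tok.atat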

gushu333 commented, Nov 7, 2018 (8 reactions)

That’s so interesting! After using this script, I got:
BLEU = 27.70, 58.9/33.4/21.2/14.1 (BP=1.000, ratio=1.015, hyp_len=65442, ref_len=64496)
Meanwhile, I found that the averaged model at about 180k updates had already reached:
BLEU = 27.37, 58.6/33.0/21.0/13.8 (BP=1.000, ratio=1.016, hyp_len=65500, ref_len=64496)
Thanks again for your help! 👍

Read more comments on GitHub >

