Reproduce the results of Transformer-Base on WMT17 En-Zh
Hi,
I want to reproduce the results of the Transformer-base model, which reportedly achieves about 34 BLEU on WMT17 En->Zh translation. The Transformer-base model was trained with the following script:
fairseq-train \
data-bin/wmt17_en_zh \
--source-lang en --target-lang zh \
--arch transformer_wmt_en_de \
--save-dir checkpoints \
--ddp-backend=no_c10d \
--criterion label_smoothed_cross_entropy \
--optimizer adam --adam-betas '(0.9,0.98)' --clip-norm 0.0 \
--lr 0.0005 --lr-scheduler inverse_sqrt \
--min-lr '1e-09' --warmup-updates 4000 \
--warmup-init-lr '1e-07' --label-smoothing 0.1 \
--dropout 0.3 --weight-decay 0.0 \
--log-format 'simple' --log-interval 100 \
--fixed-validation-seed 7 \
--max-tokens 8000 \
--save-interval-updates 10000 \
--max-update 300000 \
--fp16
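As a note on the hardware setup mentioned below: fairseq-train data-parallelizes over every GPU that CUDA exposes by default, so the 8-GPU run needs no extra flags beyond restricting the visible devices. A minimal sketch, assuming device ids 0-7 (the ids are only an example):
# Make 8 GPUs visible; fairseq-train then uses all of them for data-parallel training.
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
# ...then invoke the fairseq-train command above unchanged.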
I tested the performance of the Transformer-base model with the following script:
fairseq-generate \
data-bin/wmt17_en_zh \
--path checkpoints/checkpoint_best.pt \
--beam 5 --remove-bpe \
--batch-size 200
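The issue does not show how BLEU was computed from the generation output, and for Chinese targets the score depends strongly on the tokenizer. A hedged sketch of one common way to score the fairseq-generate output with sacrebleu's built-in Chinese tokenizer, assuming the generation log was redirected to gen.out (all file names here are placeholders):
# fairseq-generate prefixes hypotheses with "H-" and references with "T-".
grep ^H gen.out | LC_ALL=C sort -V | cut -f3- > hyp.zh
grep ^T gen.out | LC_ALL=C sort -V | cut -f2- > ref.zh
# Score with sacrebleu's Chinese tokenizer so BLEU is computed on characters
# rather than on whatever word/BPE segmentation the corpus used.
sacrebleu ref.zh --tokenize zh < hyp.zh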
I trained the Transformer-base model on 8 GPUs with a total batch size of about 64k tokens. However, it only reaches about 20 BLEU on the test set, whereas with the same training script the resulting Transformer-base model on WMT17 Zh->En achieves 24 BLEU, comparable to the implementation of Hassan et al. (2018).
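For what it is worth, the 64k figure is consistent with the flags above, assuming the default --update-freq of 1: 8 GPUs x 8,000 max tokens per GPU = 64,000 tokens per update.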
Top GitHub Comments
Hi @xwgeng Could you share your data preprocessing script? Thanks!
Sorry for the late reply. Please refer to the repo https://github.com/xwgeng/WMT17-scripts @suyuzhang
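For readers who land here without the linked repo: a minimal sketch of how a binarized data-bin/wmt17_en_zh directory is typically built with fairseq-preprocess, assuming the corpora have already been cleaned, segmented, and BPE-encoded (the train/valid/test prefixes below are placeholders; the actual preprocessing steps live in the repo above):
# Binarize BPE-encoded parallel data into the directory expected by fairseq-train.
fairseq-preprocess \
    --source-lang en --target-lang zh \
    --trainpref train --validpref valid --testpref test \
    --destdir data-bin/wmt17_en_zh \
    --workers 16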