Reproduce the results of Transformer-Base on WMT17 En-Zh
Hi,
I want to reproduce the results of the Transformer-base model, which reportedly achieves about 34 BLEU on WMT17 En->Zh translation. The Transformer-base model was trained with the following script:
fairseq-train \
data-bin/wmt17_en_zh \
--source-lang en --target-lang zh \
--arch transformer_wmt_en_de \
--save-dir checkpoints \
--ddp-backend=no_c10d \
--criterion label_smoothed_cross_entropy \
--optimizer adam --adam-betas '(0.9,0.98)' --clip-norm 0.0 \
--lr 0.0005 --lr-scheduler inverse_sqrt \
--min-lr '1e-09' --warmup-updates 4000 \
--warmup-init-lr '1e-07' --label-smoothing 0.1 \
--dropout 0.3 --weight-decay 0.0 \
--log-format 'simple' --log-interval 100 \
--fixed-validation-seed 7 \
--max-tokens 8000 \
--save-interval-updates 10000 \
--max-update 300000 \
--fp16
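As a note on the hardware setup mentioned below: fairseq-train data-parallelizes over every GPU that CUDA exposes by default, so the 8-GPU run needs no extra flags beyond restricting the visible devices. A minimal sketch, assuming device ids 0-7 (the ids are only an example):
# Make 8 GPUs visible; fairseq-train then uses all of them for data-parallel training.
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
# ...then invoke the fairseq-train command above unchanged.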
I tested the performance of the Transformer-base model with the following script:
fairseq-generate \
data-bin/wmt17_en_zh \
--path checkpoints/checkpoint_best.pt \
--beam 5 --remove-bpe \
--batch-size 200
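The issue does not show how BLEU was computed from the generation output, and for Chinese targets the score depends strongly on the tokenizer. A hedged sketch of one common way to score the fairseq-generate output with sacrebleu's built-in Chinese tokenizer, assuming the generation log was redirected to gen.out (all file names here are placeholders):
# fairseq-generate prefixes hypotheses with "H-" and references with "T-".
grep ^H gen.out | LC_ALL=C sort -V | cut -f3- > hyp.zh
grep ^T gen.out | LC_ALL=C sort -V | cut -f2- > ref.zh
# Score with sacrebleu's Chinese tokenizer so BLEU is computed on characters
# rather than on whatever word/BPE segmentation the corpus used.
sacrebleu ref.zh --tokenize zh < hyp.zh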
I trained the Transformer-base model on 8 GPUs with a total batch size of about 64k tokens. However, it only reaches about 20 BLEU on the test set, whereas with the same training script the resulting Transformer-base model on WMT17 Zh->En achieves 24 BLEU, comparable to the implementation of Hassan et al. (2018).
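For what it is worth, the 64k figure is consistent with the flags above, assuming the default --update-freq of 1: 8 GPUs x 8,000 max tokens per GPU = 64,000 tokens per update.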
Top GitHub Comments
Hi @xwgeng Could you share your data preprocessing script? Thanks!
Sorry for the late reply. Please refer to the repo https://github.com/xwgeng/WMT17-scripts @suyuzhang
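For readers who land here without the linked repo: a minimal sketch of how a binarized data-bin/wmt17_en_zh directory is typically built with fairseq-preprocess, assuming the corpora have already been cleaned, segmented, and BPE-encoded (the train/valid/test prefixes below are placeholders; the actual preprocessing steps live in the repo above):
# Binarize BPE-encoded parallel data into the directory expected by fairseq-train.
fairseq-preprocess \
    --source-lang en --target-lang zh \
    --trainpref train --validpref valid --testpref test \
    --destdir data-bin/wmt17_en_zh \
    --workers 16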