Trying to reproduce speech translation results on MuST-C
❓ Questions and Help
What is your question?
Hi
I am trying to reproduce the speech translation results you give on MuST-C, at the bottom of https://github.com/pytorch/fairseq/blob/master/examples/speech_to_text/docs/mustc_example.md.
If I download the pre-trained models (thanks for releasing these!), I get the same BLEU scores on the languages I have checked. But training my own models resulted in lower scores, so I compared my configuration against the pre-trained models.
The first problem is that the command recommended in the documentation leaves label-smoothing at 0.0 (the default). In the pre-trained models it is set to 0.1, and indeed setting label-smoothing to 0.1 improves BLEU by 1-2 points. Could the documentation be updated with this setting?
My best scores are still lower than the pre-trained models (26.5 vs 27.2 for en-es and 21.5 vs 22.6 for en-de). I noticed that the pre-trained models use `label_smoothed_cross_entropy_with_accuracy` as their loss function, but this is not available in the current fairseq (as far as I can see). So my question is: what is the `label_smoothed_cross_entropy_with_accuracy` loss, and did it improve performance over `label_smoothed_cross_entropy`?
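Since the `with_accuracy` criterion is not in current fairseq, here is a dependency-free sketch (my own simplification, not fairseq's exact code) of what a label-smoothed cross-entropy criterion with an accuracy metric typically computes. The smoothing term changes the loss (and hence training), whereas the accuracy part is usually just an extra logged statistic:

```python
def label_smoothed_nll(log_probs, target, eps=0.1):
    """Label-smoothed negative log-likelihood for one target position.

    log_probs: log-probabilities over the vocabulary (a plain list here)
    target:    index of the gold token
    eps:       smoothing mass spread uniformly over the vocabulary
    """
    vocab = len(log_probs)
    nll = -log_probs[target]              # ordinary cross entropy
    smooth = -sum(log_probs) / vocab      # uniform-prior penalty
    return (1.0 - eps) * nll + eps * smooth

def accuracy(log_probs_batch, targets):
    """Fraction of positions where the argmax matches the gold token.

    Logging-only: it does not enter the loss, so it should not change
    gradients or the trained model.
    """
    hits = sum(
        1
        for lp, t in zip(log_probs_batch, targets)
        if max(range(len(lp)), key=lp.__getitem__) == t
    )
    return hits / len(targets)
```

If the accuracy really is logging-only, the remaining gap is more likely explained by settings such as the label-smoothing value than by the criterion name itself.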
Best, Barry
What’s your environment?
- fairseq Version: master
- PyTorch Version: 1.8.1
- OS: Linux
- How you installed fairseq (pip, source):
- Build command you used (if compiling from source):
- Python version:
- CUDA/cuDNN version:
- GPU models and configuration:
- Any other relevant information:
Issue Analytics
- State:
- Created 2 years ago
- Reactions: 3
- Comments: 12 (1 by maintainers)
Top GitHub Comments
Hi, I was quite surprised that I was always getting around 1 BLEU below the official results on MuST-C, so I had been checking this issue for updates for a while.
I’ve just discovered that I wasn’t doing the checkpoint average. I don’t know why my eyes kept skipping those lines in the README! Doing it closed the gap between the official results and mine.
It may seem obvious, I know; it’s just following the instructions. But I leave the comment here anyway, in case someone else has missed this step too!
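For reference, checkpoint averaging boils down to an element-wise mean over the saved parameters of the last N checkpoints. A minimal sketch of the idea, with plain Python lists standing in for tensors (so this is not fairseq's actual `scripts/average_checkpoints.py`, just the operation it performs):

```python
def average_checkpoints(state_dicts):
    """Element-wise average of N model state dicts.

    state_dicts: list of dicts mapping parameter name -> list of floats
    (real checkpoints hold tensors; lists keep the sketch dependency-free).
    """
    n = len(state_dicts)
    averaged = {}
    for name in state_dicts[0]:
        params = [sd[name] for sd in state_dicts]   # same param from each ckpt
        averaged[name] = [sum(vals) / n for vals in zip(*params)]
    return averaged
```

Decoding from this averaged model, rather than the single last checkpoint, is what the README's averaging step gives you.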
@muhdhuz - I had to add `--model-overrides '{ "criterion" : "label_smoothed_cross_entropy"}'` to `fairseq-generate` in order to translate with the pretrained models.
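For anyone wondering why that flag helps: fairseq parses the value of `--model-overrides` as a Python dict and uses it to overwrite fields of the config stored inside the checkpoint before building the model, so a criterion name that no longer exists in the installed fairseq can be swapped for one that does. A rough sketch of the mechanism (simplified; the real logic lives in fairseq's checkpoint-loading utilities):

```python
import ast

def apply_model_overrides(saved_cfg, overrides_str):
    """Overwrite checkpoint config fields from a --model-overrides string.

    saved_cfg:     dict of settings stored in the checkpoint (simplified)
    overrides_str: the flag's value, a Python-dict literal
    """
    overrides = ast.literal_eval(overrides_str)  # e.g. {"criterion": ...}
    cfg = dict(saved_cfg)                        # leave the original intact
    cfg.update(overrides)
    return cfg
```

Only the overridden keys change; everything else in the checkpoint's config is kept as saved.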