Why change sequence order of prev_output_tokens in BART?
Hi,

In the extract_features example of BART (link to code), both src_tokens (for the encoder) and prev_output_tokens (for the decoder) are fed into the model, but prev_output_tokens is just src_tokens with the EOS token moved to the beginning. I thought that during training (teacher forcing), prev_output_tokens starts with BOS, doesn't it?

So what is the purpose of this reordering? Is it to add deliberate noise, or to assume there is another sentence before it?
Thanks in advance.
Rui
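For concreteness, here is a minimal sketch of the reordering the question describes (plain Python lists of token IDs, not the fairseq API; the ids `<s>`=0 and `</s>`=2 follow fairseq's default dictionary): fairseq rotates the trailing EOS of the sequence to position 0, using EOS rather than BOS as the decoder's start symbol.

```python
# Minimal sketch (plain Python lists of token IDs, not the fairseq API).
# Ids follow fairseq's default dictionary: <s>=0, </s>=2.
bos, eos = 0, 2
src_tokens = [bos, 31, 42, 73, eos]              # "<s> w1 w2 w3 </s>"

# prev_output_tokens as built by fairseq: rotate the trailing EOS to the front.
prev_output_tokens = [src_tokens[-1]] + src_tokens[:-1]
print(prev_output_tokens)                        # [2, 0, 31, 42, 73] = "</s> <s> w1 w2 w3"
```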
Issue Analytics
- State:
- Created 4 years ago
- Comments: 16 (8 by maintainers)
Top GitHub Comments
https://github.com/pytorch/fairseq/blob/master/fairseq/data/language_pair_dataset.py#L63 is the fixed link.

Ahh I see, so for the above case the input to the encoder is actually:
`<s> w1 w2 ... wn </s>`
and the decoder input:
`</s> <s> w1 w2 ... wn`
and the target:
`<s> w1 w2 ... wn </s>`
We will look into updating those figures. Thanks for pointing it out.
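The linked collate code is what performs this rotation. Below is a hedged re-implementation (a sketch of the behavior, not the verbatim fairseq source; the function name is ours) showing how a batch of targets becomes prev_output_tokens when move_eos_to_beginning is set:

```python
import torch

def collate_tokens_sketch(values, pad_idx, eos_idx, move_eos_to_beginning=False):
    """Pad 1-D LongTensors into a batch; optionally rotate each trailing EOS to the front."""
    size = max(v.size(0) for v in values)
    res = values[0].new_full((len(values), size), pad_idx)
    for i, v in enumerate(values):
        if move_eos_to_beginning:
            assert v[-1] == eos_idx          # every target ends with </s>
            res[i, 0] = eos_idx              # </s> becomes the first decoder input
            res[i, 1:len(v)] = v[:-1]        # the remaining tokens shift right by one
        else:
            res[i, :len(v)] = v
    return res

# Target "<s> w1 w2 </s>" -> prev_output_tokens "</s> <s> w1 w2"
tgt = torch.tensor([0, 31, 42, 2])
print(collate_tokens_sketch([tgt], pad_idx=1, eos_idx=2, move_eos_to_beginning=True))
# tensor([[ 2,  0, 31, 42]])
```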
@Colanim @villmow