Convert 12-1 and 6-1 en-de models from AllenNLP

See original GitHub issue

https://github.com/jungokasai/deep-shallow#download-trained-deep-shallow-models

  • These should be FSMT models, so the work can be part of #6940 or done after it.
  • They should be uploaded to the AllenNLP namespace. If stas takes this, they can start in stas/ and I will move them.
  • The model card(s) should link to the original repo and paper.
  • Hopefully the same en-de tokenizer has already been ported.
  • It would be interesting to compare BLEU against the initial models in that PR. Since there is no ensemble, we should be able to match the reported scores pretty closely.
  • Ideally this requires zero lines of checked-in Python code, besides maybe an integration test.

Desired Signature:

model = FSMT.from_pretrained('allen_nlp/en-de-12-1')
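
For reference, a minimal sketch of what loading and translating could look like with the FSMT classes that exist in transformers today (FSMTTokenizer and FSMTForConditionalGeneration); the Hub name below is an assumption for illustration, not the final upload location:

# Hedged sketch: the desired usage expressed with the current transformers API.
from transformers import FSMTForConditionalGeneration, FSMTTokenizer

name = "allenai/wmt16-en-de-12-1"  # assumed Hub name; adjust once uploaded
tokenizer = FSMTTokenizer.from_pretrained(name)
model = FSMTForConditionalGeneration.from_pretrained(name)

inputs = tokenizer("Machine learning is great!", return_tensors="pt")
outputs = model.generate(**inputs, num_beams=5)  # beam size per the original repo
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

A smoke check along these lines could also double as the single integration test the checklist above allows for.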

Weights can be downloaded with gdown (https://pypi.org/project/gdown/):

pip install gdown
gdown https://drive.google.com/uc?id=1x_G2cjvM1nW5hjAB8-vWxRqtQTlmIaQU
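
The same download can also be scripted through gdown's Python API; a small sketch (the output filename is an assumption):

import gdown

# File id copied from the Google Drive link above.
url = "https://drive.google.com/uc?id=1x_G2cjvM1nW5hjAB8-vWxRqtQTlmIaQU"
gdown.download(url, output="en-de-12-1.tar.gz", quiet=False)  # assumed filename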

@stas00 if you are blocked in the late stages of #6940 and have extra cycles, you could give this a whirl. We could also wait for that to be finalized and then either of us can take this.

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 19 (18 by maintainers)

Top GitHub Comments

1 reaction
sshleifer commented, Sep 11, 2020

The new model's config should set {'num_beams': 5}, according to https://github.com/jungokasai/deep-shallow#evaluation.
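
Concretely, pinning that generation default on the converted checkpoint might look like this (the Hub name is again an assumption):

from transformers import FSMTConfig

# Load the converted config, set the beam size, and re-save it.
config = FSMTConfig.from_pretrained("allenai/wmt16-en-de-12-1")  # assumed name
config.num_beams = 5  # per https://github.com/jungokasai/deep-shallow#evaluation
config.save_pretrained("wmt16-en-de-12-1")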

1 reaction
sshleifer commented, Sep 10, 2020
  • Since it is a different dataset with a different validation set, the 28 and 41 BLEU scores are not comparable.
  • These models should be significantly faster than the Marian models at similar performance levels.
  • We can finetune them on the new data if we think that will help.
  • FYI, Helsinki-NLP/opus-mt-en-de was trained on much more data than the fairseq model, I think, though I'm not totally sure.
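
Once the models are rescored on a shared test set, sacreBLEU gives directly comparable numbers; a minimal sketch (the sentences are placeholders, not real outputs):

import sacrebleu

# One hypothesis per source sentence, and one reference stream of equal length.
hypotheses = ["Maschinelles Lernen ist großartig!"]
references = [["Maschinelles Lernen ist großartig!"]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.1f}")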