Convert 12-1 and 6-1 en-de models from AllenNLP
https://github.com/jungokasai/deep-shallow#download-trained-deep-shallow-models
- These should be FSMT models, so they can be part of #6940 or done after it.
- They should be uploaded to the AllenNLP namespace. If stas takes this, they can start in stas/ and I will move them.
- model card(s) should link to the original repo and paper.
- Hopefully the same en-de tokenizer has already been ported.
- It would be interesting to compare BLEU to the original models in that PR. There is no ensembling, so we should be able to reproduce the reported scores pretty closely.
- Ideally this requires 0 lines of checked-in Python code, besides maybe an integration test.
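For the BLEU comparison mentioned above, a real evaluation should use sacrebleu; purely to illustrate what is being compared, here is a minimal stdlib-only sketch of sentence-level BLEU (single reference, no smoothing — a simplification, not the metric used in the paper):

```python
from collections import Counter
import math

def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU: geometric mean of modified
    n-gram precisions times a brevity penalty. No smoothing, single
    reference -- for illustration only."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        # clipped (modified) n-gram counts
        overlap = sum((cand_ngrams & ref_ngrams).values())
        total = max(sum(cand_ngrams.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # brevity penalty for candidates shorter than the reference
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * geo_mean
```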
Desired Signature:
model = FSMT.from_pretrained('allen_nlp/en-de-12-1')
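In current transformers the FSMT classes are `FSMTForConditionalGeneration` and `FSMTTokenizer`, so the desired signature above would in practice look something like the sketch below. The `allen_nlp/en-de-12-1` hub id is the proposal from this issue, not a published model name:

```python
def hub_model_id(namespace, pair, layers):
    """Build a hub id like 'allen_nlp/en-de-12-1' (namespace per this issue's proposal)."""
    return f"{namespace}/{pair}-{layers}"

def load_fsmt(model_id):
    """Load a converted FSMT checkpoint; requires `pip install transformers`."""
    from transformers import FSMTForConditionalGeneration, FSMTTokenizer
    tokenizer = FSMTTokenizer.from_pretrained(model_id)
    model = FSMTForConditionalGeneration.from_pretrained(model_id)
    return tokenizer, model
```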
Weights can be downloaded with gdown (https://pypi.org/project/gdown/):
pip install gdown
gdown https://drive.google.com/uc?id=1x_G2cjvM1nW5hjAB8-vWxRqtQTlmIaQU
@stas00 if you are blocked in the late stages of #6940 and have extra cycles, you could give this a whirl. We could also wait for that PR to be finalized, and then either of us can take this.
New model's config should be
{'num_beams': 5}
according to https://github.com/jungokasai/deep-shallow#evaluation
Helsinki-NLP/opus-mt-en-de was trained on way more data than the fairseq model, I think, but I'm not totally sure.
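For reference, the relevant part of the converted model's `config.json` might look like the fragment below. Only `num_beams: 5` is confirmed by the repo's evaluation instructions; the other fields are assumptions based on how transformers FSMT configs are usually laid out:

```json
{
  "model_type": "fsmt",
  "num_beams": 5
}
```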