
Cannot generate translation using m2m_100

See original GitHub issue

Greetings,

I was trying to deploy and use the m2m_100 model. I did so on a p3.8xlarge instance, since loading the model eats all the memory on a p3.2xlarge.

The instance runs CUDA 11.0 on the Deep Learning Base AMI (Ubuntu 18.04) Version 30.0, with the latest pip release of torch, and with fairseq and fairscale installed from git. When I try to reproduce the generation results, I keep getting CUDA out-of-memory errors. Even when I add --cpu, it keeps trying to use the GPUs and exits because of CUDA memory issues.

Please advise. Thanks.
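One common workaround for that last symptom is to hide the GPUs from PyTorch before anything initializes CUDA, so a CPU-only run cannot fall back to the GPUs. A minimal sketch of that environment-variable trick (a generic PyTorch workaround, not a fairseq flag, and not something suggested in the thread itself):

```python
import os

# Hide every GPU from PyTorch *before* torch (or fairseq) is imported, so a
# CPU-only run cannot silently fall back to the GPUs and hit CUDA OOM again.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import torch

print(torch.cuda.is_available())  # expected: False -> generation stays in CPU RAM
```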

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 11 (3 by maintainers)

Top GitHub Comments

2 reactions
shruti-bh commented, Nov 9, 2020

@dk-et Can you check if you get the same error with commit 6debe291 on the master branch?
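For anyone reproducing that check, the fairseq installation can be pinned to that commit straight from git. A minimal sketch, assuming the upstream repository is github.com/facebookresearch/fairseq (the repo referenced elsewhere on this page); the short commit hash is the one named in the comment above:

```python
import subprocess
import sys

# Reinstall fairseq pinned to commit 6debe291, so the failing generation can be
# retried against the code the maintainer points to. The repository URL is an
# assumption; adjust it if you installed from a fork.
subprocess.run(
    [
        sys.executable, "-m", "pip", "install", "--upgrade",
        "git+https://github.com/facebookresearch/fairseq.git@6debe291",
    ],
    check=True,
)
```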

2 reactions
shruti-bh commented, Oct 27, 2020

@AbdallahNasir Please refer to the latest version of the README for checkpoints and pipeline arguments to use with different hardware configurations.

About fine-tuning, that’s a great point. We don’t have support yet for loading a pretrained model and fine-tuning it, but we will try to add this soon.
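To make that pointer concrete, below is a rough sketch of what a generation call for one of the smaller m2m_100 checkpoints can look like, driven from Python. The data path, checkpoint and dictionary file names are placeholders, and the flag set is reconstructed from the fairseq m2m_100 README of that period rather than quoted from it; the 12B checkpoints additionally require the --pipeline-model-parallel arguments described there, so treat this as a starting point and defer to the README for the exact values.

```python
import subprocess

# Hedged sketch of a fairseq-generate call for a smaller m2m_100 checkpoint.
# All paths and file names below are placeholders; check the fairseq m2m_100
# README for the flags that match your checkpoint and hardware.
cmd = [
    "fairseq-generate", "data_bin",                # binarized eval data (placeholder)
    "--path", "418M_last_checkpoint.pt",           # small checkpoint that fits in memory
    "--fixed-dictionary", "model_dict.128k.txt",
    "--task", "translation_multi_simple_epoch",
    "--lang-pairs", "language_pairs.txt",
    "-s", "en", "-t", "fr",
    "--encoder-langtok", "src", "--decoder-langtok",
    "--remove-bpe", "sentencepiece",
    "--beam", "5",
    "--batch-size", "1",
    "--cpu",                                       # keep the run off the GPUs entirely
]
subprocess.run(cmd, check=True)
```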

Read more comments on GitHub >

Top Results From Across the Web

M2M100 - Hugging Face
In this work, we create a true Many-to-Many multilingual translation model that can translate directly between any pair of 100 languages. We build...

Add m2m 100 multilingual translation model from FAIR #8054
All I've done is download the state dict, run their command, and asked for help: m2m generate OOMs on v100 (facebookresearch/fairseq#2772) ...

Running FairSeq M2M-100 machine translation model in CPU ...
This guide describes the steps for running the Facebook FairSeq m2m_100 multilingual translation model in a CPU-only environment.

arXiv:2109.05611v2 [cs.CL] 15 Sep 2021
A neural machine translation (NMT) model trained to generate translations by starting with an empty output ... experiment with the M2M-100-small initialization.

Neural Machine Translation using Hugging Face Pipeline
The Hugging Face Pipeline object abstracts the three steps of model inference, namely tokenization, generating target sequence ids, and decoding ...
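The Hugging Face port mentioned above is often the quickest way to sanity-check m2m_100 translations on modest hardware, since the 418M and 1.2B checkpoints fit in ordinary RAM. A minimal sketch using the transformers library and the facebook/m2m100_418M checkpoint it documents, kept on CPU by simply never moving the model to a GPU:

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Load the 418M checkpoint on CPU (no .to("cuda") call), so nothing touches the GPUs.
model_name = "facebook/m2m100_418M"
tokenizer = M2M100Tokenizer.from_pretrained(model_name)
model = M2M100ForConditionalGeneration.from_pretrained(model_name)

tokenizer.src_lang = "en"
encoded = tokenizer("Life is like a box of chocolates.", return_tensors="pt")
generated = model.generate(
    **encoded,
    forced_bos_token_id=tokenizer.get_lang_id("fr"),  # force French as the target language
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```

This mirrors the three pipeline steps named in the last result above: tokenization, target-id generation, and decoding.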
