
Cannot generate translation using m2m_100

See original GitHub issue

Greetings,

I was trying to deploy and use the m2m_100 model. I did so on a p3.8xlarge instance, since loading the model eats all the memory on a p3.2xlarge.

The instance runs CUDA 11.0 on the Deep Learning Base AMI (Ubuntu 18.04) Version 30.0, with the latest pip release of torch, and with fairseq and fairscale installed from git. When I try to reproduce the generation results, I keep getting CUDA out-of-memory errors. Even when I add --cpu, it keeps trying to use the GPUs and exits because of CUDA memory issues.

Please advise. Thanks.
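One common workaround for that last symptom is to hide the GPUs from PyTorch before anything initializes CUDA, so a CPU-only run cannot fall back to the GPUs. A minimal sketch of that environment-variable trick (a generic PyTorch workaround, not a fairseq flag, and not something suggested in the thread itself):

```python
import os

# Hide every GPU from PyTorch *before* torch (or fairseq) is imported, so a
# CPU-only run cannot silently fall back to the GPUs and hit CUDA OOM again.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import torch

print(torch.cuda.is_available())  # expected: False -> generation stays in CPU RAM
```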

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 11 (3 by maintainers)

Top GitHub Comments

2 reactions
shruti-bh commented, Nov 9, 2020

@dk-et Can you check if you get the same error with commit 6debe291 on the master branch?
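For anyone reproducing that check, the fairseq installation can be pinned to that commit straight from git. A minimal sketch, assuming the upstream repository is github.com/facebookresearch/fairseq (the repo referenced elsewhere on this page); the short commit hash is the one named in the comment above:

```python
import subprocess
import sys

# Reinstall fairseq pinned to commit 6debe291, so the failing generation can be
# retried against the code the maintainer points to. The repository URL is an
# assumption; adjust it if you installed from a fork.
subprocess.run(
    [
        sys.executable, "-m", "pip", "install", "--upgrade",
        "git+https://github.com/facebookresearch/fairseq.git@6debe291",
    ],
    check=True,
)
```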

2 reactions
shruti-bh commented, Oct 27, 2020

@AbdallahNasir Please refer to the latest version of the README for checkpoints and pipeline arguments to use with different hardware configurations.

About fine-tuning, that’s a great point. We don’t have support yet for loading a pretrained model and fine-tuning it, but we will try to add this soon.
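To make that pointer concrete, below is a rough sketch of what a generation call for one of the smaller m2m_100 checkpoints can look like, driven from Python. The data path, checkpoint and dictionary file names are placeholders, and the flag set is reconstructed from the fairseq m2m_100 README of that period rather than quoted from it; the 12B checkpoints additionally require the --pipeline-model-parallel arguments described there, so treat this as a starting point and defer to the README for the exact values.

```python
import subprocess

# Hedged sketch of a fairseq-generate call for a smaller m2m_100 checkpoint.
# All paths and file names below are placeholders; check the fairseq m2m_100
# README for the flags that match your checkpoint and hardware.
cmd = [
    "fairseq-generate", "data_bin",                # binarized eval data (placeholder)
    "--path", "418M_last_checkpoint.pt",           # small checkpoint that fits in memory
    "--fixed-dictionary", "model_dict.128k.txt",
    "--task", "translation_multi_simple_epoch",
    "--lang-pairs", "language_pairs.txt",
    "-s", "en", "-t", "fr",
    "--encoder-langtok", "src", "--decoder-langtok",
    "--remove-bpe", "sentencepiece",
    "--beam", "5",
    "--batch-size", "1",
    "--cpu",                                       # keep the run off the GPUs entirely
]
subprocess.run(cmd, check=True)
```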

Read more comments on GitHub >

Top Results From Across the Web

M2M100 - Hugging Face
In this work, we create a true Many-to-Many multilingual translation model that can translate directly between any pair of 100 languages. We build...

Add m2m 100 multilingual translation model from FAIR #8054
All I've done is download the state dict, run their command, and asked for help: m2m generate OOMs on v100 (facebookresearch/fairseq#2772) ...

Running FairSeq M2M-100 machine translation model in CPU ...
This guide describes the steps for running the Facebook FairSeq m2m_100 multilingual translation model in a CPU-only environment.

arXiv:2109.05611v2 [cs.CL] 15 Sep 2021
A neural machine translation (NMT) model trained to generate translations by starting with an empty output ... experiment with the M2M-100-small initialization.

Neural Machine Translation using Hugging Face Pipeline
The Hugging Face Pipeline object abstracts the three steps of model inference, namely tokenization, generating target sequence ids, and decoding ...
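The Hugging Face port mentioned above is often the quickest way to sanity-check m2m_100 translations on modest hardware, since the 418M and 1.2B checkpoints fit in ordinary RAM. A minimal sketch using the transformers library and the facebook/m2m100_418M checkpoint it documents, kept on CPU by simply never moving the model to a GPU:

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Load the 418M checkpoint on CPU (no .to("cuda") call), so nothing touches the GPUs.
model_name = "facebook/m2m100_418M"
tokenizer = M2M100Tokenizer.from_pretrained(model_name)
model = M2M100ForConditionalGeneration.from_pretrained(model_name)

tokenizer.src_lang = "en"
encoded = tokenizer("Life is like a box of chocolates.", return_tensors="pt")
generated = model.generate(
    **encoded,
    forced_bos_token_id=tokenizer.get_lang_id("fr"),  # force French as the target language
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```

This mirrors the three pipeline steps named in the last result above: tokenization, target-id generation, and decoding.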
