
Error loading facebook/opt-30b with the text-generation pipeline using 8-bit mixed precision

See original GitHub issue

System Info

  • transformers version: 4.24.0
  • Platform: Linux-5.4.0-109-generic-x86_64-with-glibc2.10
  • Python version: 3.8.13
  • Huggingface_hub version: 0.11.0
  • PyTorch version (GPU?): 1.13.0a0+d0d6b1f (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: no

Who can help?

@patrickvonplaten, @Narsil, @gante

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, …)
  • My own task or dataset (give details below)

Reproduction

Running the following on a system with one NVIDIA A5000 GPU:

from transformers import pipeline
model = "facebook/opt-30b"
model_kwargs = {"device_map": "auto", "load_in_8bit": True}
generator = pipeline(task="text-generation", model=model, device=0, model_kwargs=model_kwargs)

yields the error:

ValueError: Could not load model facebook/opt-30b with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForCausalLM'>, <class 'transformers.models.opt.modeling_opt.OPTForCausalLM'>).
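One possible culprit — an assumption on my part, since the thread never shows the full traceback — is that pipeline(...) is given both device=0 and device_map="auto": with device_map="auto" the weights are already placed across devices, and an 8-bit model cannot be moved afterwards. A minimal sketch of a pre-flight check for this kind of argument clash (check_pipeline_kwargs is a hypothetical helper, not part of transformers):

```python
def check_pipeline_kwargs(device, model_kwargs):
    """Flag argument combinations that commonly break 8-bit loading.

    Hypothetical helper: it only inspects the kwargs that would be passed
    to transformers.pipeline(); it does not touch the library itself.
    """
    problems = []
    if device is not None and model_kwargs.get("device_map") is not None:
        # device_map="auto" already decides weight placement, so an
        # explicit `device` is redundant at best and conflicting at worst.
        problems.append("`device` conflicts with `device_map`")
    if model_kwargs.get("load_in_8bit") and model_kwargs.get("device_map") is None:
        problems.append("`load_in_8bit=True` needs a `device_map`")
    return problems

# The exact argument combination from the reproduction above:
issues = check_pipeline_kwargs(0, {"device_map": "auto", "load_in_8bit": True})
```

Under this assumption, dropping the device=0 argument and keeping only model_kwargs would be the first thing to try.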

Expected behavior

The generator should be created without error, and text should be generated via generator.__call__.

The code works with no error when using smaller opt model checkpoints: “facebook/opt-2.7b”, “facebook/opt-6.7b”.

Despite the error message above, the model and tokenizer can be created, and generation works, when bypassing the pipeline and calling AutoModelForCausalLM.from_pretrained(model, device_map="auto") with model="facebook/opt-30b".

Issue Analytics

  • State: closed
  • Created: 10 months ago
  • Comments: 12 (6 by maintainers)

Top GitHub Comments

1 reaction
morrisalp commented, Nov 22, 2022

Thanks! I mainly wanted to see what the largest LLM I could fit on one of my GPUs would be using mixed precision, and I couldn’t tell previously if the 30B model would be OOM due to the other errors…
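On the fitting question, a rough back-of-envelope estimate (my own, not from the thread): 8-bit quantization stores roughly one byte per parameter, so OPT-30B's weights alone need about 28 GiB, which already exceeds the 24 GiB on a single A5000 before counting activations or the KV cache.

```python
def int8_weight_gib(n_params):
    # 8-bit quantization stores ~1 byte per parameter; this ignores
    # activations, the KV cache, and quantization overhead, so it is
    # a lower bound on required memory.
    return n_params / 2**30

opt_30b_gib = int8_weight_gib(30e9)  # facebook/opt-30b has ~30B parameters
a5000_gib = 24                       # NVIDIA A5000 memory capacity
fits = opt_30b_gib <= a5000_gib
```

So even with the loading bug fixed, device_map="auto" would have to offload part of OPT-30B to CPU on this machine; something in the ~20B-parameter range is the realistic single-A5000 ceiling for 8-bit weights.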

1 reaction
sgugger commented, Nov 22, 2022

Please provide the full traceback, as we can’t see what’s happening otherwise, especially since I can’t reproduce locally on my side. cc @younesbelkada, who might have better luck reproducing the bug!


Top Results From Across the Web

[1905.12334] Mixed Precision Training With 8-bit Floating Point
In this paper, we propose a method to train deep neural networks using 8-bit floating point representation for weights, activations, errors, ...

What to do when you get an error - Hugging Face Course
In this section we'll look at some common errors that can occur when you're trying to generate predictions from your freshly tuned Transformer...
Scaling DeepSpeech using Mixed Precision and KubeFlow
Our audio development pipeline contains various deep learning models trained on large volumes of audio and text datasets, to feed features to our...

Mixed precision training — Catalyst 22.04 documentation
Catalyst support a variety of backends for mixed precision training. After PyTorch 1.6 release, it's possible to use AMP natively inside torch.amp package...
