Bug occurs when using a Hugging Face seq2seq model with the inference engine.

I’m trying to deploy some language models using DeepSpeed’s inference engine. Currently I am trying to deploy a seq2seq language model, and I have succeeded in parallelizing it. However, when I try to generate, the following error occurs.

import time
import torch

from torch.distributed import get_rank
from deepspeed import InferenceEngine
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer


model_name = 'facebook/bart-large'
model = AutoModelForSeq2SeqLM.from_pretrained(model_name).half()
tokenizer = AutoTokenizer.from_pretrained(model_name)

model = InferenceEngine(
    model=model,
    mp_size=2,
    dtype=torch.half,
)

torch.cuda.empty_cache()

tokens = tokenizer.encode(
    "Hello",
    return_tensors="pt",
    truncation=True,
    padding=True,
).cuda()

output = model.generate(
    tokens,
    min_length=30,
    max_length=31,
)  # <--- problem

if get_rank() == 0:
    print(f"Output: {tokenizer.decode(output.tolist()[0])}")

deepspeed --num_gpus=2 inference.py
RuntimeError: Tensors must be non-overlapping and dense
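
For context on the message itself: as far as I know, PyTorch's distributed backend raises it when a tensor passed to a collective is not laid out as a single gap-free block of memory with no two indices sharing an element. The thread does not say which tensor triggers it here, so the following is only a sketch of what the property means. The helper `is_non_overlapping_and_dense` below is a hypothetical pure-Python function written for illustration (it is not a DeepSpeed or transformers API); it takes a shape and element strides and mirrors the usual check: sort dimensions by stride and require each stride to equal the product of the smaller dimensions' sizes.

```python
from typing import Sequence


def is_non_overlapping_and_dense(shape: Sequence[int], strides: Sequence[int]) -> bool:
    """Illustrative check (hypothetical helper, not a library API).

    Returns True when the (shape, strides) layout covers one gap-free block
    of memory and no two indices map to the same element.
    """
    # Dimensions of size 0 or 1 cannot create gaps or overlap, so skip them.
    dims = [(stride, size) for size, stride in zip(shape, strides) if size > 1]
    # Visit dimensions from the smallest stride to the largest.
    dims.sort()
    expected_stride = 1
    for stride, size in dims:
        # Each stride must be exactly the number of elements spanned so far.
        if stride != expected_stride:
            return False
        expected_stride *= size
    return True


# A contiguous 2x3 layout: dense and non-overlapping.
contiguous = is_non_overlapping_and_dense((2, 3), (3, 1))
# The transpose of that layout: not contiguous, but still dense.
transposed = is_non_overlapping_and_dense((3, 2), (1, 3))
# A row broadcast to 4 rows with stride 0: rows alias the same memory.
expanded = is_non_overlapping_and_dense((4, 3), (0, 1))
# Every second column of a 2x4 layout: leaves gaps in memory.
strided = is_non_overlapping_and_dense((2, 2), (4, 2))
```

Under this definition, broadcast views (stride 0) and strided slices fail the check, while transposed layouts pass. In PyTorch, calling `.contiguous()` on a tensor yields a copy that satisfies it.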

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

aphedges commented, Jun 28, 2021 (1 reaction)

I am getting the same error when using the num_return_sequences parameter with GPT-Neo. I ran deepspeed --num_gpus 4 test.py with the following code in test.py:

import os

import deepspeed
import torch
from transformers import pipeline

local_rank = int(os.getenv("LOCAL_RANK", "0"))
world_size = int(os.getenv("WORLD_SIZE", "1"))
pipe = pipeline(
    "text-generation",
    model="EleutherAI/gpt-neo-2.7B",
    framework="pt",
    device=local_rank,
)
pipe.model = deepspeed.init_inference(pipe.model, mp_size=world_size, dtype=torch.float32)
output = pipe("I am very", do_sample=True, num_return_sequences=10)
if torch.distributed.get_rank() == 0:
    print(output)

I get RuntimeError: Tensors must be non-overlapping and dense errors when running deepspeed==0.4.0, but I can confirm that #1168 fixes the issue for me.

This code was run with Python 3.7.9, torch==1.8.1, and transformers==4.6.1.

RezaYazdaniAminabadi commented, Jun 22, 2021 (0 reactions)

Hi @hyunwoongko

Thanks for investigating the issue for these new models 👍 I will test the branch and merge this soon 😃

Reza
