Error when using GPT2 `model.forward` method with DeepSpeed inference
Hi,
I am trying to use DeepSpeed with GPT-2/Neo for inference, but I get an error when calling `model.forward` directly. The error seems to occur only if I don't provide an `attention_mask`.
import deepspeed
from transformers import GPT2Tokenizer, GPT2LMHeadModel
import torch
tokenizer = GPT2Tokenizer.from_pretrained("gpt2-medium")
model = GPT2LMHeadModel.from_pretrained("gpt2-medium")
model = model.cuda()
model = deepspeed.init_inference(model,
                                 mp_size=1,
                                 dtype=torch.float,
                                 replace_method='auto')
ids = tokenizer.encode("A valley full of unicorns was discovered",
                       add_special_tokens=False, return_tensors="pt").cuda()
# attn = torch.ones((1, 8), dtype=torch.long).cuda()
output = model(
    input_ids=ids,
    # attention_mask=attn,  # <-- need to provide this
    return_dict=True,
    use_cache=True
)
print(output)
Traceback (most recent call last):
  File "deepspeed_attn.py", line 25, in <module>
    use_cache=True
  File "torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "deepspeed/inference/engine.py", line 222, in forward
    return self.module(*inputs, **kwargs)
  File "torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "transformers/models/gpt2/modeling_gpt2.py", line 954, in forward
    return_dict=return_dict,
  File "torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "transformers/models/gpt2/modeling_gpt2.py", line 797, in forward
    output_attentions=output_attentions,
  File "torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "deepspeed/ops/transformer/inference/transformer_inference.py", line 611, in forward
    self.norm_b)
  File "torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "deepspeed/ops/transformer/inference/transformer_inference.py", line 393, in forward
    self.qkv_merging)
  File "deepspeed/ops/transformer/inference/transformer_inference.py", line 141, in forward
    while len(input_mask.shape) < 4:
AttributeError: 'NoneType' object has no attribute 'shape'
When I use `model.generate`, as in the example scripts, I don't get any errors, because `generate` takes care of creating the `attention_mask` (but in general I think `model.forward` should work even without an `attention_mask`).
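As a stopgap, the commented-out lines in the snippet above suggest a workaround: build an all-ones mask and pass it explicitly. The sketch below wraps that idea in a small helper. The name `forward_with_default_mask` is hypothetical and not part of DeepSpeed or transformers, and the all-ones mask assumes a single unpadded prompt (a padded batch would need zeros at the padding positions).

```python
import torch


def forward_with_default_mask(model, input_ids, attention_mask=None, **kwargs):
    """Call the model, supplying an all-ones attention mask when none is given.

    Hypothetical helper for illustration; assumes unpadded input, so every
    token position should be attended to.
    """
    if attention_mask is None:
        # Same shape, dtype (long) and device as input_ids, filled with ones.
        attention_mask = torch.ones_like(input_ids)
    return model(input_ids=input_ids, attention_mask=attention_mask, **kwargs)
```

With the `model` and `ids` from the snippet above, this would be called as `forward_with_default_mask(model, ids, return_dict=True, use_cache=True)`.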
Issue Analytics
- State:
- Created 2 years ago
- Comments:10 (6 by maintainers)
Top GitHub Comments
Thanks! Just tested it on my end as well and it works both with and without `attention_mask`.
I see. I will fix it soon. Thanks for trying it out 👍