
model.generate does not work when using a FlaxGPTNeoForCausalLM model in PT (flax-community-event)


Environment info

  • transformers version: 4.9.0.dev0
  • Platform: Linux-5.4.0-1043-gcp-x86_64-with-glibc2.29
  • Python version: 3.8.10
  • PyTorch version (GPU?): 1.9.0+cpu (False)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): 0.3.4 (tpu)
  • Jax version: 0.2.16
  • JaxLib version: 0.1.68
  • Using GPU in script?: no
  • Using distributed or parallel set-up in script?: no

Who can help

@patil-suraj

Models

FlaxGPTNeoForCausalLM, GPTNeoForCausalLM

Information

I have fine-tuned a FlaxGPTNeoForCausalLM model on the provided TPU and I’m trying to convert it to PyTorch and generate text, but I’m unable to make it work. These are the steps I followed:

from transformers import GPTNeoForCausalLM, AutoTokenizer

model = GPTNeoForCausalLM.from_pretrained('gptneo-125M-finetuned', from_flax=True)
tokenizer = AutoTokenizer.from_pretrained('EleutherAI/gpt-neo-125M', use_fast=True)
text = 'A house with three bedrooms'
input_ids = tokenizer(text)
model.generate(input_ids, do_sample=True, top_p=0.84, top_k=100, max_length=100)

and the stack trace:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/transformers/src/transformers/tokenization_utils_base.py in __getattr__(self, item)
    241         try:
--> 242             return self.data[item]
    243         except KeyError:

KeyError: 'new_ones'

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_387004/1910535609.py in <module>
----> 1 model.generate(input_ids, do_sample=True, top_p=0.84, top_k=100, max_length=300)

~/neo/lib/python3.8/site-packages/torch/autograd/grad_mode.py in decorate_context(*args, **kwargs)
     26         def decorate_context(*args, **kwargs):
     27             with self.__class__():
---> 28                 return func(*args, **kwargs)
     29         return cast(F, decorate_context)
     30 

~/transformers/src/transformers/generation_utils.py in generate(self, input_ids, max_length, min_length, do_sample, early_stopping, num_beams, temperature, top_k, top_p, repetition_penalty, bad_words_ids, bos_token_id, pad_token_id, eos_token_id, length_penalty, no_repeat_ngram_size, encoder_no_repeat_ngram_size, num_return_sequences, max_time, max_new_tokens, decoder_start_token_id, use_cache, num_beam_groups, diversity_penalty, prefix_allowed_tokens_fn, output_attentions, output_hidden_states, output_scores, return_dict_in_generate, forced_bos_token_id, forced_eos_token_id, remove_invalid_values, synced_gpus, **model_kwargs)
    906         if model_kwargs.get("attention_mask", None) is None:
    907             # init `attention_mask` depending on `pad_token_id`
--> 908             model_kwargs["attention_mask"] = self._prepare_attention_mask_for_generation(
    909                 input_ids, pad_token_id, eos_token_id
    910             )

~/transformers/src/transformers/generation_utils.py in _prepare_attention_mask_for_generation(self, input_ids, pad_token_id, eos_token_id)
    402         if is_pad_token_in_inputs_ids and is_pad_token_not_equal_to_eos_token_id:
    403             return input_ids.ne(pad_token_id).long()
--> 404         return input_ids.new_ones(input_ids.shape, dtype=torch.long)
    405 
    406     def _prepare_encoder_decoder_kwargs_for_generation(

~/transformers/src/transformers/tokenization_utils_base.py in __getattr__(self, item)
    242             return self.data[item]
    243         except KeyError:
--> 244             raise AttributeError
    245 
    246     def __getstate__(self):

AttributeError: 

As always, being new to all this, I’m fairly certain I missed something obvious 😃 But in case I didn’t, I thought I’d share and see what you all think.

Thanks!

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 7 (5 by maintainers)

Top GitHub Comments

4 reactions
patil-suraj commented, Jul 6, 2021

Hi @TheodoreGalanos, from your code snippet it seems the issue is that the tokenizer is not returning PyTorch tensors:

text = 'A house with three bedrooms'
input_ids = tokenizer(text)

This returns a dict-like BatchEncoding object with the keys input_ids and attention_mask, whose values are plain Python lists. To get tensors, one should pass the return_tensors argument and set it to "pt", so the tokenizer returns PyTorch tensors.
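
For illustration, a minimal sketch (assuming the same tokenizer as in the snippet above) showing the two return types:

text = 'A house with three bedrooms'
print(type(tokenizer(text).input_ids))                       # <class 'list'>
print(type(tokenizer(text, return_tensors="pt").input_ids))  # <class 'torch.Tensor'>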

So the AttributeError in the traceback is caused by passing the BatchEncoding object instead of a tensor.

This should fix it.

inputs = tokenizer(text, return_tensors="pt")
model.generate(**inputs, ....)
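
Putting the fix together as a runnable sketch (checkpoint path, prompt, and sampling parameters all taken from the original report):

from transformers import GPTNeoForCausalLM, AutoTokenizer

model = GPTNeoForCausalLM.from_pretrained('gptneo-125M-finetuned', from_flax=True)
tokenizer = AutoTokenizer.from_pretrained('EleutherAI/gpt-neo-125M', use_fast=True)

text = 'A house with three bedrooms'
inputs = tokenizer(text, return_tensors="pt")  # BatchEncoding holding PyTorch tensors

# **inputs forwards both input_ids and attention_mask to generate
output = model.generate(**inputs, do_sample=True, top_p=0.84, top_k=100, max_length=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))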

2 reactions
patil-suraj commented, Sep 22, 2021

Hi @StellaAthena

input_ids = tokenizer(text, return_tensors="pt")
model.generate(input_ids, do_sample=True, top_p=0.84, top_k=100, max_length=20)

The tokenizer returns a dict-like object, BatchEncoding, so here input_ids is not a tensor but a BatchEncoding. And generate expects its first argument, input_ids, to be a tensor.

So here, we could get the input_ids tensor using the input_ids attribute on the BatchEncoding object:

input_ids = tokenizer(text, return_tensors="pt").input_ids
model.generate(input_ids, do_sample=True, top_p=0.84, top_k=100, max_length=20)

Or, as it’s a dict-like object, we could also pass it as kwargs:

inputs = tokenizer(text, return_tensors="pt")
model.generate(**inputs, do_sample=True, top_p=0.84, top_k=100, max_length=20)
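
One note on the difference between the two variants (a side observation, based on the traceback above): the kwargs form also forwards the tokenizer’s attention_mask, so generate does not have to fall back to building an all-ones mask in _prepare_attention_mask_for_generation. For a single unpadded prompt the two are equivalent, but for padded batches passing the mask matters. A quick way to see what gets forwarded:

inputs = tokenizer(text, return_tensors="pt")
print(list(inputs.keys()))  # ['input_ids', 'attention_mask']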

