model.generate does not work when using a FlaxGPTNeoForCausalLM model in PT (flax-community-event)
Environment info
- transformers version: 4.9.0.dev0
- Platform: Linux-5.4.0-1043-gcp-x86_64-with-glibc2.29
- Python version: 3.8.10
- PyTorch version (GPU?): 1.9.0+cpu (False)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): 0.3.4 (tpu)
- Jax version: 0.2.16
- JaxLib version: 0.1.68
- Using GPU in script?: no
- Using distributed or parallel set-up in script?: no
Who can help
Models
FlaxGPTNeoForCausalLM, GPTNeoForCausalLM
Information
I have finetuned a FlaxGPTNeoForCausalLM model on the provided TPU and I’m trying to convert it to PyTorch and generate text, but I’m unable to make it work. These are the steps I followed:
from transformers import AutoTokenizer, GPTNeoForCausalLM

model = GPTNeoForCausalLM.from_pretrained('gptneo-125M-finetuned', from_flax=True)
tokenizer = AutoTokenizer.from_pretrained('EleutherAI/gpt-neo-125M', use_fast=True)
text = 'A house with three bedrooms'
input_ids = tokenizer(text)
model.generate(input_ids, do_sample=True, top_p=0.84, top_k=100, max_length=100)
and the stack trace:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~/transformers/src/transformers/tokenization_utils_base.py in __getattr__(self, item)
241 try:
--> 242 return self.data[item]
243 except KeyError:
KeyError: 'new_ones'
During handling of the above exception, another exception occurred:
AttributeError Traceback (most recent call last)
/tmp/ipykernel_387004/1910535609.py in <module>
----> 1 model.generate(input_ids, do_sample=True, top_p=0.84, top_k=100, max_length=300)
~/neo/lib/python3.8/site-packages/torch/autograd/grad_mode.py in decorate_context(*args, **kwargs)
26 def decorate_context(*args, **kwargs):
27 with self.__class__():
---> 28 return func(*args, **kwargs)
29 return cast(F, decorate_context)
30
~/transformers/src/transformers/generation_utils.py in generate(self, input_ids, max_length, min_length, do_sample, early_stopping, num_beams, temperature, top_k, top_p, repetition_penalty, bad_words_ids, bos_token_id, pad_token_id, eos_token_id, length_penalty, no_repeat_ngram_size, encoder_no_repeat_ngram_size, num_return_sequences, max_time, max_new_tokens, decoder_start_token_id, use_cache, num_beam_groups, diversity_penalty, prefix_allowed_tokens_fn, output_attentions, output_hidden_states, output_scores, return_dict_in_generate, forced_bos_token_id, forced_eos_token_id, remove_invalid_values, synced_gpus, **model_kwargs)
906 if model_kwargs.get("attention_mask", None) is None:
907 # init `attention_mask` depending on `pad_token_id`
--> 908 model_kwargs["attention_mask"] = self._prepare_attention_mask_for_generation(
909 input_ids, pad_token_id, eos_token_id
910 )
~/transformers/src/transformers/generation_utils.py in _prepare_attention_mask_for_generation(self, input_ids, pad_token_id, eos_token_id)
402 if is_pad_token_in_inputs_ids and is_pad_token_not_equal_to_eos_token_id:
403 return input_ids.ne(pad_token_id).long()
--> 404 return input_ids.new_ones(input_ids.shape, dtype=torch.long)
405
406 def _prepare_encoder_decoder_kwargs_for_generation(
~/transformers/src/transformers/tokenization_utils_base.py in __getattr__(self, item)
242 return self.data[item]
243 except KeyError:
--> 244 raise AttributeError
245
246 def __getstate__(self):
AttributeError:
As always, being new to all this, I’m fairly certain I missed something obvious 😃 But in the case I didn’t, I thought I’d share and see what you all think.
Thanks!
Issue Analytics
- Created: 2 years ago
- Comments: 7 (5 by maintainers)
Top GitHub Comments
Hi @TheodoreGalanos

From your code snippet it seems the issue is that the tokenizer is not returning PyTorch tensors. Calling tokenizer(text) returns a dict-like BatchEncoding object with the keys input_ids and attention_mask, whose values are plain Python lists. To get tensors, pass the return_tensors argument and set it to 'pt', so the tokenizer returns PyTorch tensors. The attribute error is caused by passing the BatchEncoding object instead of tensors. This should fix it.
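A minimal sketch of that fix, reusing the checkpoint names and sampling parameters from the snippet above:

from transformers import AutoTokenizer, GPTNeoForCausalLM

# Load the finetuned Flax checkpoint into a PyTorch model, as in the report.
model = GPTNeoForCausalLM.from_pretrained('gptneo-125M-finetuned', from_flax=True)
tokenizer = AutoTokenizer.from_pretrained('EleutherAI/gpt-neo-125M', use_fast=True)

text = 'A house with three bedrooms'

# Ask the tokenizer for PyTorch tensors instead of plain Python lists.
inputs = tokenizer(text, return_tensors='pt')

# Pass the input_ids tensor, not the BatchEncoding, to generate().
output_ids = model.generate(
    inputs.input_ids,
    do_sample=True,
    top_p=0.84,
    top_k=100,
    max_length=100,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))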
Hi @StellaAthena

The tokenizer returns a dict-like BatchEncoding object, so here input_ids is not a tensor but a BatchEncoding, while generate expects its first argument input_ids to be a tensor. So here we could get the input_ids tensor using the input_ids attribute on the BatchEncoding object, or, since it’s a dict-like object, we could also pass it as kwargs.
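For example (a sketch of the two options, assuming tokenizer, model, and text are defined as in the original snippet):

inputs = tokenizer(text, return_tensors='pt')

# Option 1: pass the input_ids tensor explicitly.
model.generate(inputs.input_ids, do_sample=True, top_p=0.84, top_k=100, max_length=100)

# Option 2: unpack the BatchEncoding as keyword arguments,
# which also forwards the attention_mask.
model.generate(**inputs, do_sample=True, top_p=0.84, top_k=100, max_length=100)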