Using "generate()" to generate text with a causal language model such as GPT-2 repeats the input at the beginning of the output.
System Info
This issue is not related to the system configuration.
Who can help?
@patrickvonplaten, @Narsil, @gante
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, …)
- My own task or dataset (give details below)
Reproduction
Here are two kinds of examples, taken from your documentation.
1. Multinomial sampling:
>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> import torch
>>> tokenizer = AutoTokenizer.from_pretrained("gpt2")
>>> model = AutoModelForCausalLM.from_pretrained("gpt2")
>>> prompt = "Today I believe we can finally"
>>> input_ids = tokenizer(prompt, return_tensors="pt").input_ids
>>> # sample up to 30 tokens
>>> torch.manual_seed(0) # doctest: +IGNORE_RESULT
>>> outputs = model.generate(input_ids, do_sample=True, max_length=30)
>>> tokenizer.batch_decode(outputs, skip_special_tokens=True)
['Today I believe we can finally get rid of discrimination," said Rep. Mark Pocan (D-Wis.).\n\n"Just look at the']
2. Beam-search decoding:
>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
>>> tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-de")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-de")
>>> sentence = "Paris is one of the densest populated areas in Europe."
>>> input_ids = tokenizer(sentence, return_tensors="pt").input_ids
>>> outputs = model.generate(input_ids, num_beams=5)
>>> tokenizer.batch_decode(outputs, skip_special_tokens=True)
['Paris ist eines der dichtesten besiedelten Gebiete Europas.']
Expected behavior
I found that when I use a causal language model with "generate()" to produce a text output from a given text input, there are two situations: 1. if the model is good enough and the input is reasonable, the output repeats the input exactly at the beginning; 2. if the model is not good enough or the input is unreasonable, the output still tries to repeat the input but cannot reproduce it exactly at the beginning.
However, what I want is an output that does not repeat the input, and I don't know which parameter to set to achieve this. I know this kind of language model generates output token by token, but I wonder if it is possible to avoid repeating the input in the output with your code.
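For reference, a common way to obtain only the newly generated text is to slice off the prompt tokens before decoding, since "generate()" returns the prompt ids followed by the newly generated ids. A minimal sketch using plain lists with made-up token ids (with real tensors, the equivalent slice would be `outputs[0][input_ids.shape[-1]:]`):

```python
# Sketch: generate() returns the prompt ids followed by the new ids,
# so slicing off the prompt length keeps only the continuation.
# The token ids below are hypothetical, for illustration only.
prompt_ids = [15496, 11, 616]          # ids of the input prompt
generated = prompt_ids + [1438, 318]   # shape of what generate() returns
new_ids = generated[len(prompt_ids):]  # drop the repeated prompt
print(new_ids)                         # -> [1438, 318]
```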
Issue Analytics
- State:
- Created a year ago
- Comments: 7 (3 by maintainers)
Just for future readers:
- "pipelines": from raw string to raw string
- "generate": from input_ids tensors to output_ids tensors

"generate" doesn't have the option to "cut" the input_ids; it really operates on what the model sees, which are all the ids. "pipeline", on the other hand, is designed to work as much as possible out of the box for non-ML users, so it will sometimes add some magic for you (like here, cutting the input, which is annoying when writing an autocomplete workflow, for instance).

Much thanks, I've read through the source code of "pipeline" in detail. It seems that it calls "generate()" and does the mentioned processing via the following code in "postprocess()":
```python
if return_type == ReturnType.FULL_TEXT:
    all_text = prompt_text + text[prompt_length:]
else:
    all_text = text[prompt_length:]
```
Actually, I was confused about the difference between "pipeline('text-generation')" and "generate()" before; now it is much clearer to me. I'll close the issue. Wish you a good day.
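As a closing note, the string-level cut quoted above can be sketched in plain Python (the prompt and continuation strings here are made up for illustration; in the real pipeline, passing `return_full_text=False` selects the else branch):

```python
# Sketch of the postprocess logic quoted above, with made-up strings.
prompt_text = "Today I believe we can finally"
text = prompt_text + " get rid of discrimination"  # full decoded output
prompt_length = len(prompt_text)

full_text = prompt_text + text[prompt_length:]  # ReturnType.FULL_TEXT
new_text = text[prompt_length:]                 # continuation only

print(new_text)  # -> " get rid of discrimination"
```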