Using "generate()" to generate text with a causal language model such as GPT-2 repeats the input at the beginning of the output.
System Info
This issue is not related to the system configuration.
Who can help?
@patrickvonplaten, @Narsil, @gante
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, …)
- My own task or dataset (give details below)
Reproduction
Here are two kinds of examples, taken from your documentation.
1. Multinomial sampling:
>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> import torch
>>> tokenizer = AutoTokenizer.from_pretrained("gpt2")
>>> model = AutoModelForCausalLM.from_pretrained("gpt2")
>>> prompt = "Today I believe we can finally"
>>> input_ids = tokenizer(prompt, return_tensors="pt").input_ids
>>> # sample up to 30 tokens
>>> torch.manual_seed(0) # doctest: +IGNORE_RESULT
>>> outputs = model.generate(input_ids, do_sample=True, max_length=30)
>>> tokenizer.batch_decode(outputs, skip_special_tokens=True)
['Today I believe we can finally get rid of discrimination," said Rep. Mark Pocan (D-Wis.).\n\n"Just look at the']
2. Beam-search decoding:
>>> from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
>>> tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-de")
>>> model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-de")
>>> sentence = "Paris is one of the densest populated areas in Europe."
>>> input_ids = tokenizer(sentence, return_tensors="pt").input_ids
>>> outputs = model.generate(input_ids, num_beams=5)
>>> tokenizer.batch_decode(outputs, skip_special_tokens=True)
['Paris ist eines der dichtesten besiedelten Gebiete Europas.']
Expected behavior
I found that when I use a causal language model with "generate()" to produce a text output from a given text input, there are two situations: 1. if the model is good enough and the input is reasonable, the output repeats the input exactly at the beginning; 2. if the model is not good enough or the input is unreasonable, the output still tries to repeat the input but cannot reproduce it exactly at the beginning.
However, what I want is an output that does not repeat the input, and I don't know which parameter to set to achieve this. I know this kind of language model generates output token by token, but I wonder if it is possible to avoid repeating the input in the output with your code.
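For reference, a common way to obtain only the newly generated text is to slice off the prompt tokens before decoding, since "generate()" returns the prompt ids followed by the newly generated ids. A minimal sketch using plain lists with made-up token ids (with real tensors, the equivalent slice would be `outputs[0][input_ids.shape[-1]:]`):

```python
# Sketch: generate() returns the prompt ids followed by the new ids,
# so slicing off the prompt length keeps only the continuation.
# The token ids below are hypothetical, for illustration only.
prompt_ids = [15496, 11, 616]          # ids of the input prompt
generated = prompt_ids + [1438, 318]   # shape of what generate() returns
new_ids = generated[len(prompt_ids):]  # drop the repeated prompt
print(new_ids)                         # -> [1438, 318]
```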
Issue Analytics
- State:
- Created a year ago
- Comments: 7 (3 by maintainers)
Just for future readers:
- "pipelines": from raw string to raw string
- "generate": from input_ids tensors to output_ids tensors

"generate" doesn't have the option to "cut" the input_ids; it really operates on what the model sees, which are all the ids. "pipeline", on the other hand, is designed to work as much as possible out of the box for non-ML users, so it will sometimes add some magic for you (like here, cutting the input, which is annoying when writing an autocomplete workflow, for instance).

Much thanks, I've read through the source code of "pipeline" in detail. It seems that it calls "generate()" and does the mentioned processing via the following code in "postprocess()":
```python
if return_type == ReturnType.FULL_TEXT:
    all_text = prompt_text + text[prompt_length:]
else:
    all_text = text[prompt_length:]
```
Actually, I was confused about the difference between "pipeline('text-generation')" and "generate()" before; now it is much clearer to me. I'll close the issue. Wish you a good day.
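As a closing note, the string-level cut quoted above can be sketched in plain Python (the prompt and continuation strings here are made up for illustration; in the real pipeline, passing `return_full_text=False` selects the else branch):

```python
# Sketch of the postprocess logic quoted above, with made-up strings.
prompt_text = "Today I believe we can finally"
text = prompt_text + " get rid of discrimination"  # full decoded output
prompt_length = len(prompt_text)

full_text = prompt_text + text[prompt_length:]  # ReturnType.FULL_TEXT
new_text = text[prompt_length:]                 # continuation only

print(new_text)  # -> " get rid of discrimination"
```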