setting max_new_tokens in text-generation pipeline with OPT produces error
System Info
- Python 3.7.12
- transformers 4.22.2
- Google Vertex AI platform
Who can help?
@LysandreJik (Feel free to tag whoever owns OPT if that’s not you! – it’s not specified in the list)
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, …)
- My own task or dataset (give details below)
Reproduction
from transformers import pipeline

# `device` is assumed to be defined earlier (e.g. a GPU index or "cpu")
test_generator = pipeline(
    "text-generation",
    model="facebook/opt-125m",
    do_sample=True,
    device=device
)
response = test_generator(
    "Here's how this model responds to a test prompt:",
    max_new_tokens=200,
    num_return_sequences=1,
)
print(response[0]['generated_text'])
Expected behavior
This should generate text, but it produces this error:
ValueError: Both max_new_tokens and max_length have been set but they serve the same purpose – setting a limit to the generated output length. Remove one of those arguments. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)
Meanwhile, the official documentation specifically recommends setting ‘max_new_tokens’ rather than ‘max_length’:
max_length (int, optional, defaults to model.config.max_length) — The maximum length the generated tokens can have. Corresponds to the length of the input prompt + max_new_tokens. In general, prefer the use of max_new_tokens, which ignores the number of tokens in the prompt.

max_new_tokens (int, optional) — The maximum numbers of tokens to generate, ignoring the number of tokens in the prompt.
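As a quick illustration of how the two arguments relate, here is the arithmetic from the docs with invented numbers (the prompt length below is made up for the example):

```python
# Hypothetical token counts, just to show the relationship from the docs:
prompt_tokens = 12           # tokens in the input prompt (invented value)
max_new_tokens = 200         # tokens we want generated

# max_length counts the prompt too, so the equivalent limit would be:
max_length = prompt_tokens + max_new_tokens
print(max_length)  # 212
```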
The problem can be worked around by manually setting max_length=None, but that should happen by default as it does with other autoregressive models. The same code runs without error if you swap out the OPT model for EleutherAI/gpt-neo-125M.
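To make the workaround concrete without downloading a model, here is a minimal sketch that only roughly mimics the check `generate()` performs; `check_length_args` is a hypothetical stand-in, not the actual transformers source:

```python
def check_length_args(max_length=None, max_new_tokens=None):
    # Hypothetical stand-in for the validation inside .generate();
    # not the real transformers code.
    if max_length is not None and max_new_tokens is not None:
        raise ValueError(
            "Both max_new_tokens and max_length have been set ..."
        )

# The pipeline inherits a default max_length from the model config,
# so passing max_new_tokens at call time triggers the clash:
try:
    check_length_args(max_length=20, max_new_tokens=200)
except ValueError as err:
    print("raised:", err)

# The workaround: explicitly clearing max_length avoids the error.
check_length_args(max_length=None, max_new_tokens=200)
print("ok")
```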
Issue Analytics
- Created a year ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
Hey @gqfiddler 👋 – thank you for raising this issue 👀
@Narsil this seems to be a problem between how .generate() expects the max length to be defined, and how the text-generation pipeline prepares the inputs. When max_new_tokens is passed outside the initialization, this line merges the two sets of sanitized arguments (from the initialization we have max_length, from the new kwargs we have max_new_tokens).

To fix this, we can either remove the ValueError from generate (but expose ourselves to weird errors) or add more logic to the pipelines, e.g. to ignore max_length when max_new_tokens is set (which is not very pretty). WDYT?

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
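The second fix option discussed in the maintainer comment above (dropping max_length from the merged kwargs whenever the caller supplies max_new_tokens) could be sketched like this; `merge_generate_kwargs` is a hypothetical helper, not the actual pipeline code:

```python
def merge_generate_kwargs(init_kwargs, call_kwargs):
    # Hypothetical sketch, not the real pipeline logic: merge the
    # kwargs sanitized at initialization with those passed at call
    # time, and discard the inherited max_length when the caller
    # explicitly asks for max_new_tokens.
    merged = {**init_kwargs, **call_kwargs}
    if "max_new_tokens" in call_kwargs:
        merged.pop("max_length", None)
    return merged

print(merge_generate_kwargs({"max_length": 20}, {"max_new_tokens": 200}))
# {'max_new_tokens': 200}
```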
Please note that issues that do not follow the contributing guidelines are likely to be ignored.