
setting max_new_tokens in text-generation pipeline with OPT produces error

See original GitHub issue

System Info

  • python 3.7.12
  • transformers 4.22.2
  • Google Vertex AI platform

Who can help?

@LysandreJik (Feel free to tag whoever owns OPT if that’s not you! – it’s not specified in the list)

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, …)
  • My own task or dataset (give details below)

Reproduction

from transformers import pipeline

test_generator = pipeline(
    "text-generation", 
    model="facebook/opt-125m", 
    do_sample=True,
    device=-1,  # -1 = CPU; use a GPU index (e.g. 0) if one is available
)

response = test_generator(
    "Here's how this model responds to a test prompt:",
    max_new_tokens=200,
    num_return_sequences=1,
)
print(response[0]['generated_text'])

Expected behavior

This should generate text, but it produces this error:

ValueError: Both max_new_tokens and max_length have been set but they serve the same purpose – setting a limit to the generated output length. Remove one of those arguments. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)

Meanwhile, the official documentation specifically recommends setting max_new_tokens rather than max_length:

  • max_length (int, optional, defaults to model.config.max_length) — The maximum length the generated tokens can have. Corresponds to the length of the input prompt + max_new_tokens. In general, prefer the use of max_new_tokens, which ignores the number of tokens in the prompt.
  • max_new_tokens (int, optional) — The maximum number of tokens to generate, ignoring the number of tokens in the prompt.
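
To make the distinction concrete, here is a minimal sketch (not from the original report) that calls .generate() directly with max_new_tokens:

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

inputs = tokenizer("Here's how this model responds to a test prompt:", return_tensors="pt")
prompt_len = inputs["input_ids"].shape[1]

# max_length counts prompt + new tokens; max_new_tokens counts only the new
# ones, so max_new_tokens=200 behaves like max_length = prompt_len + 200.
output = model.generate(**inputs, do_sample=True, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))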

The problem can be worked around by manually setting max_length=None, but that should happen by default as it does with other autoregressive models. The same code runs without error if you swap out the OPT model for EleutherAI/gpt-neo-125M.
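
For reference, the workaround looks like this (the same pipeline call as above, with max_length explicitly cleared):

response = test_generator(
    "Here's how this model responds to a test prompt:",
    max_length=None,  # clear the max_length inherited from the model config
    max_new_tokens=200,
    num_return_sequences=1,
)
print(response[0]["generated_text"])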

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
gante commented on Oct 10, 2022

Hey @gqfiddler 👋 – thank you for raising this issue 👀

@Narsil this seems to be a mismatch between how .generate() expects the max length to be defined and how the text-generation pipeline prepares its inputs. When max_new_tokens is passed at call time rather than at initialization, this line merges the two sets of sanitized arguments (from the initialization we have max_length, from the new kwargs we have max_new_tokens).

To fix this, we can either remove the ValueError from generate (but expose ourselves to weird errors) or add more logic to the pipelines, e.g. to ignore max_length when max_new_tokens is passed (which is not very pretty). WDYT?
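
Purely as an illustration of the second option, a rough sketch of that pipeline-side logic might look like this (a hypothetical helper, not actual transformers code):

def merge_generate_kwargs(init_kwargs, call_kwargs):
    # Call-time kwargs override init-time ones.
    merged = {**init_kwargs, **call_kwargs}
    # If the caller sets max_new_tokens, drop any max_length that was
    # inherited from the pipeline initialization / model config.
    if "max_new_tokens" in call_kwargs:
        merged.pop("max_length", None)
    return merged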

0 reactions
github-actions[bot] commented on Nov 6, 2022

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
