
setting max_new_tokens in text-generation pipeline with OPT produces error

See original GitHub issue

System Info

  • python 3.7.12
  • transformers 4.22.2
  • Google Vertex AI platform

Who can help?

@LysandreJik (Feel free to tag whoever owns OPT if that’s not you! – it’s not specified in the list)

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, …)
  • My own task or dataset (give details below)

Reproduction

from transformers import pipeline

test_generator = pipeline(
    "text-generation", 
    model="facebook/opt-125m", 
    do_sample=True,
    device=-1,  # -1 = CPU; use a GPU index (e.g. 0) if one is available
)

response = test_generator(
    "Here's how this model responds to a test prompt:",
    max_new_tokens=200,
    num_return_sequences=1,
)
print(response[0]['generated_text'])

Expected behavior

This should generate text, but it produces this error:

ValueError: Both max_new_tokens and max_length have been set but they serve the same purpose – setting a limit to the generated output length. Remove one of those arguments. Please refer to the documentation for more information. (https://huggingface.co/docs/transformers/main/en/main_classes/text_generation)

Meanwhile, the official documentation specifically recommends setting max_new_tokens rather than max_length:

  • max_length (int, optional, defaults to model.config.max_length) — The maximum length the generated tokens can have. Corresponds to the length of the input prompt + max_new_tokens. In general, prefer the use of max_new_tokens, which ignores the number of tokens in the prompt.
  • max_new_tokens (int, optional) — The maximum number of tokens to generate, ignoring the number of tokens in the prompt.
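
To make the distinction concrete, here is a minimal sketch (not from the original report) that calls .generate() directly with max_new_tokens:

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

inputs = tokenizer("Here's how this model responds to a test prompt:", return_tensors="pt")
prompt_len = inputs["input_ids"].shape[1]

# max_length counts prompt + new tokens; max_new_tokens counts only the new
# ones, so max_new_tokens=200 behaves like max_length = prompt_len + 200.
output = model.generate(**inputs, do_sample=True, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))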

The problem can be worked around by manually setting max_length=None, but that should happen by default as it does with other autoregressive models. The same code runs without error if you swap out the OPT model for EleutherAI/gpt-neo-125M.
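
For reference, the workaround looks like this (the same pipeline call as above, with max_length explicitly cleared):

response = test_generator(
    "Here's how this model responds to a test prompt:",
    max_length=None,  # clear the max_length inherited from the model config
    max_new_tokens=200,
    num_return_sequences=1,
)
print(response[0]["generated_text"])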

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
gante commented on Oct 10, 2022

Hey @gqfiddler 👋 – thank you for raising this issue 👀

@Narsil this seems to be a mismatch between how .generate() expects the max length to be defined and how the text-generation pipeline prepares its inputs. When max_new_tokens is passed at call time rather than at initialization, this line merges the two sets of sanitized arguments (from the initialization we have max_length, from the new kwargs we have max_new_tokens).

To fix this, we can either remove the ValueError from generate (but expose ourselves to weird errors) or add more logic to the pipelines, e.g. to ignore max_length when max_new_tokens is passed (which is not very pretty). WDYT?
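
Purely as an illustration of the second option, a rough sketch of that pipeline-side logic might look like this (a hypothetical helper, not actual transformers code):

def merge_generate_kwargs(init_kwargs, call_kwargs):
    # Call-time kwargs override init-time ones.
    merged = {**init_kwargs, **call_kwargs}
    # If the caller sets max_new_tokens, drop any max_length that was
    # inherited from the pipeline initialization / model config.
    if "max_new_tokens" in call_kwargs:
        merged.pop("max_length", None)
    return merged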

0 reactions
github-actions[bot] commented on Nov 6, 2022

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
