How to generate sentences in batches, instead of generating sentences one by one
After fine-tuning GPT-2, I want to use it to generate sentences in batches instead of one by one.
So I tried to modify the code in `examples/text-generation/run_generation.py`. Line 239 of `run_generation.py` is:

`encoded_prompt = tokenizer.encode(prompt_text, add_special_tokens=False, return_tensors="pt")`

Here `prompt_text` is `str`-typed data, but when I change it to `List[str]`-typed data, the call always returns `50256`.
But looking at the source code, the type of `prompt_text` can be `str`, `List[str]`, or `List[int]`. I tested this example separately, and for the token ids it always returns `50256`. So must `prompt_text` be `str`-typed data? What modifications should I make to generate sentences in batches using `examples/text-generation/run_generation.py`?
Looking forward to your reply!
Issue Analytics
- Created: 3 years ago
- Comments: 5 (2 by maintainers)
Yes! Please take a look at this test, which does batch=4 generation for summarization using T5: https://github.com/huggingface/transformers/blob/55cb2ee62eb482787cff17585955f7193fe35dfa/tests/test_modeling_t5.py#L559
Hey @patrickvonplaten, is batch generation available for `T5ForConditionalGeneration`?