PhraseConstraints appearing only directly after the input or at the end of the generated sentence
System Info
- transformers version: 4.22.0
- Platform: Linux-3.10.0-1160.25.1.el7.x86_64-x86_64-with-glibc2.17
- Python version: 3.9.12
- Huggingface_hub version: 0.9.1
- PyTorch version (GPU?): 1.12.1 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?: Yes
- Using distributed or parallel set-up in script?: No
Who can help?
@patrickvonplaten @Narsil @cwkeam
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, …)
- My own task or dataset (give details below)
Reproduction
Overview
The PR that introduced word constraints to the generation function includes an example script (Example 2: A Mix of Strong Constraint and a Disjunctive Constraint). Below is a slightly modified version; the modifications should not affect the output:
- I added the imports for GPT2LMHeadModel and GPT2Tokenizer.
- I removed the .to(torch_device) calls so I could run the script.
- I rewrote the assertions (removing self.....) so the script can run on its own.
from transformers import GPT2LMHeadModel, GPT2Tokenizer
model = GPT2LMHeadModel.from_pretrained("gpt2")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
# A strong (phrasal) constraint: this word must appear in every output
force_word = "scared"
# A disjunctive constraint: at least one of these words must appear
force_flexible = ["scream", "screams", "screaming", "screamed"]
force_words_ids = [
    tokenizer([force_word], add_prefix_space=True, add_special_tokens=False).input_ids,
    tokenizer(force_flexible, add_prefix_space=True, add_special_tokens=False).input_ids,
]
starting_text = ["The soldiers", "The child"]
input_ids = tokenizer(starting_text, return_tensors="pt").input_ids
outputs = model.generate(
input_ids,
force_words_ids=force_words_ids,
num_beams=10,
num_return_sequences=1,
no_repeat_ngram_size=1,
remove_invalid_values=True,
)
generated_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)
assert generated_text[0] == "The soldiers, who were all scared and screaming at each other as they tried to get out of the"
assert generated_text[1] == "The child was taken to a local hospital where she screamed and scared for her life, police said."
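For reference, the two entries of `force_words_ids` differ in meaning even though they have the same nesting depth. A minimal sketch of that structure follows; the token ids below are hypothetical placeholders for illustration, not real GPT-2 ids:

```python
# Illustrative sketch of the nesting that force_words_ids uses.
# Token ids are made-up placeholders, not real GPT-2 ids.
phrasal = [[26844]]                  # from tokenizer([force_word], ...).input_ids:
                                     # this exact token sequence must appear
disjunctive = [[12524], [1416, 82]]  # from tokenizer(force_flexible, ...).input_ids:
                                     # at least ONE of these sequences must appear
force_words_ids = [phrasal, disjunctive]

# Each constraint is a list of token-id lists.
for constraint in force_words_ids:
    assert all(isinstance(seq, list) and all(isinstance(t, int) for t in seq)
               for seq in constraint)
```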
ToDo
- Run the script on transformers==4.20.1: it works perfectly well.
- Run the script on any version above 4.20.1: it will not pass the assertions.
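When switching between installs to check both sides of the 4.20.1 boundary, it can help to have the repro script report which side it is on. A minimal stdlib-only sketch (the naive `parse` helper is my own, not part of transformers):

```python
# Sketch: report whether the installed transformers release predates the
# regression; the issue reports 4.20.1 as the last version that passes.
from importlib.metadata import version, PackageNotFoundError

def parse(v: str) -> tuple:
    # naive parse keeping only leading numeric components: "4.22.0" -> (4, 22, 0)
    parts = []
    for p in v.split("."):
        digits = "".join(ch for ch in p if ch.isdigit())
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

LAST_GOOD = (4, 20, 1)

try:
    v = parse(version("transformers"))
    status = "expected to pass" if v <= LAST_GOOD else "expected to fail"
    print(f"transformers {v}: constraint assertions {status}")
except PackageNotFoundError:
    print("transformers is not installed")
```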
Expected behavior
Problem
The constraining algorithm seems to be broken in versions above 4.20.1. For example, on version 4.22.0 the script generates the following outputs:
The soldiers, who had been stationed at the base for more than a year before being evacuated screaming scared The child was taken to a local hospital where he died.\n 'I don’t think screaming scared
You can see that the constraints just get appended to the end of the generated sentence. In fact, when experimenting with constraints, I found that they are placed either right after the input (the following example is made up to show what happens):
The soldiers screaming scared, who had been stationed at the base for more than a year before being evacuated The child screaming scared was taken to a local hospital where he died.\n 'I don’t think
or at the end of the generated sentence:
The soldiers, who had been stationed at the base for more than a year before being evacuated screaming scared The child was taken to a local hospital where he died.\n 'I don’t think screaming scared
- I expect the constraints to appear naturally within the generated sentence (as in the testing script). On versions above 4.20.1 they are just appended in a senseless manner.
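The failure mode described above can be caught with a rough, word-level heuristic; the helper below is my own sketch (not part of transformers) and simply checks that a forced word is neither the first nor the last word of the generated continuation:

```python
# Sketch of a check for the failure mode described above: a constraint that is
# satisfied only by gluing the forced word to the start or end of the output.
def constraint_is_natural(prompt: str, generated: str, word: str) -> bool:
    continuation = generated[len(prompt):].strip()
    tokens = continuation.split()
    # "natural" here means the forced word appears, but not as the very
    # first or very last word of the continuation
    return word in tokens and tokens[0] != word and tokens[-1] != word

# Passes for the 4.20.1-style output, where "scared" appears mid-sentence:
assert constraint_is_natural(
    "The soldiers",
    "The soldiers, who were all scared and screaming at each other",
    "scared",
)
```

The buggy outputs, where the forced words are appended at the very end, would fail this check.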
- I hope that helps.
- Please ask me if you have further questions, though I am a beginner myself.
Issue Analytics
- State:
- Created a year ago
- Reactions: 1
- Comments: 13 (7 by maintainers)
Top GitHub Comments
Reopened (it’s still on my generate task queue, which sadly is quite long) 😃
@gante more generally, should we maybe mark the disjunctive decoding as experimental and state that we don’t actively maintain it? It’s simply too time-consuming to look into this at the moment IMO