Getting special characters in model generation
Hello. I fine-tuned my DialoGPT-small model on the DailyDialog dataset. When chatting with the model, its last generated utterance contains runs of random punctuation characters. This is my generation code:
from transformers import AutoModelWithLMHead, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained('microsoft/DialoGPT-small')
model = AutoModelWithLMHead.from_pretrained('output-daily')

raw = ""
while raw != "END":
    for step in range(4):
        raw = input(">> User:")
        if raw == "END":
            break
        # Encode the new user input, appending the end-of-sequence token
        new_user_input_ids = tokenizer.encode(raw + tokenizer.eos_token, return_tensors='pt')
        # Append the new input to the chat history (except on the first turn)
        bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids
        chat_history_ids = model.generate(
            bot_input_ids,
            max_length=150,
            do_sample=True,
            temperature=0.7,
            top_k=50,
            top_p=0.92,
            no_repeat_ngram_size=3,
            pad_token_id=tokenizer.eos_token_id
        )
        # Print only the newly generated tokens, not the whole history
        print("Bot: {}".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)))
    bot_input_ids = []
    print("\n")
The output:
>> User:Hello, what are you doing?
Bot: I'm just watching TV.
>> User:What is your favorite movie?
Bot: My favorite movie is Superbad.
>> User:I don't know it.
Bot: I like comedies best.
>> User:Me too!
Bot: !!!( At the end of the movie, O. K. )!!!!!1!!@!!excuse me!!( at the end, O'K. )!!excuses me!!Excuse me!(At the end )!exc!!automatics!!remotes!!Automatics!rematics!autos!!Remotes!autop!!Automatics!!autoblog!!log!autobi
>> User:Me too!
Bot: What do you think of this dress?
>> User:It's amazing!
Bot: I can't believe it.
>> User:Yes I lie.
Bot: What does it look like?
>> User:A potato.
Bot: !!!( After trying )!!!!!
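A plausible contributor (my assumption, not confirmed in the thread): `generate`'s `max_length=150` counts the prompt tokens too, so as the concatenated history grows with each turn, the room left for the reply shrinks toward zero. A rough back-of-the-envelope, where the ~40-tokens-per-exchange figure is purely illustrative:

```python
MAX_LENGTH = 150  # total budget shared by history + new reply

def reply_budget(history_tokens):
    """Tokens left for generation once the history is counted against max_length."""
    return max(MAX_LENGTH - history_tokens, 0)

# Assume each user/bot exchange adds roughly 40 tokens of history:
for turn in range(1, 5):
    history = 40 * turn
    print(turn, reply_budget(history))  # the budget shrinks: 110, 70, 30, 0
```

By the fourth turn the budget hits zero, which lines up with the breakdown appearing on the last utterance of each four-turn round.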
Issue Analytics
- State:
- Created 2 years ago
- Comments: 9
Top GitHub Comments
Changing the number of chat rounds kept in memory solved the issue most of the time, but it was not as reliable as I needed it to be. As per my response on Sep 17, I instead took the length of the tensor into account; using a 'hacky' fix like the one below, I got it to work without freaking out at all.
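One way to take the tensor length into account, sketched here on a plain Python list (`MAX_HISTORY_TOKENS` and `trim_history` are hypothetical names I chose; with a tensor the equivalent slice would be `bot_input_ids[:, -MAX_HISTORY_TOKENS:]`):

```python
MAX_HISTORY_TOKENS = 256  # assumed cap; DialoGPT-small's context window is 1024

def trim_history(token_ids):
    """Drop the oldest tokens once the history exceeds the cap."""
    return token_ids[-MAX_HISTORY_TOKENS:]

history = list(range(300))        # stand-in for accumulated token ids
trimmed = trim_history(history)
print(len(trimmed))               # 256
print(trimmed[0])                 # 44 — the oldest 44 tokens were dropped
```

Trimming from the front keeps the most recent context, which matters most for the next reply.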
To be absolutely honest, I did not pursue this line of thinking, since I managed to get it working well enough for my implementation. Whether adding EoS manually would make it behave properly, I do not know.
Since it breaks after step 3/4, a potential hacky solution is to maintain a fixed-length queue (of length 3, maybe) that stores past inputs and outputs, and use those rather than the whole history. Although some context is lost, this would allow the chatbot to run endlessly without breaking down, keeping some context rather than none as when step is hardcoded to 0.
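The fixed-length queue described above can be sketched with `collections.deque` and its `maxlen` argument, which discards the oldest entry automatically (the `remember`/`build_context` helpers are names I made up for illustration):

```python
from collections import deque

# Rolling memory: keep only the last 3 user/bot exchanges.
history = deque(maxlen=3)

def remember(user_ids, bot_ids):
    """Store one exchange; once full, the oldest exchange is dropped automatically."""
    history.append((user_ids, bot_ids))

def build_context():
    """Concatenate the remembered token ids into one flat context for generation."""
    flat = []
    for user_ids, bot_ids in history:
        flat.extend(user_ids)
        flat.extend(bot_ids)
    return flat

# Simulate 5 turns; only the newest 3 exchanges survive.
for turn in range(5):
    remember([turn * 10], [turn * 10 + 1])

print(len(history))     # 3
print(build_context())  # [20, 21, 30, 31, 40, 41]
```

Unlike hardcoding `step` to 0 (which discards everything), this keeps a sliding window of context while bounding the input length.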
When you say EoS is not added, is there a way to add it manually? Like, if we append EoS after every response, would that fix the issue?
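Appending EoS manually would just mean checking the generated ids before concatenating them into the history. A minimal sketch, assuming DialoGPT's GPT-2 vocabulary where `<|endoftext|>` has id 50256 (in real code you would use `tokenizer.eos_token_id` instead of the literal; `ensure_eos` is a name I made up):

```python
EOS_TOKEN_ID = 50256  # GPT-2 / DialoGPT '<|endoftext|>'

def ensure_eos(token_ids):
    """Append the EOS id if the response does not already end with it."""
    if not token_ids or token_ids[-1] != EOS_TOKEN_ID:
        return token_ids + [EOS_TOKEN_ID]
    return token_ids

print(ensure_eos([15496, 11, 995]))  # [15496, 11, 995, 50256]
print(ensure_eos([15496, 50256]))    # [15496, 50256] — already terminated, unchanged
```

Whether this alone fixes the garbled output is untested here; it only guarantees each turn in the history is properly terminated, which is the format DialoGPT was trained on.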