Gibberish text generation after converting to Hugging Face format
Hi, I am having trouble converting my checkpoints to Hugging Face format. The model works fine when using DeepSpeed + Megatron (example).
generate_samples_from_prompt(neox_args, model, ['สวัสดีครับ' for i in range(1)], temperature=0.9, top_k=40)
>> {'context': 'สวัสดีครับ',  ## means 'hello'
'text': 'เพื่อนๆครับ มีเพื่อนๆ คนไหนที่ทำงานแล้ว หรือกำลังทำงานแล้ว แล้วได้ลาออกจากงานไปแล้วแต่ยังหางานอยู่บ้างครับ พอดีอยากทราบวิธีหางานหรือแนะนำบริษัท ที่ให้เงินเดือนดี และน่าเชื่อถือหน่อยครับ'}  ## coherent Thai text asking friends for job-hunting advice and recommendations for reputable, well-paying companies
However, the output becomes gibberish after converting to Hugging Face format (example).
pipe = TextGenerationPipeline(model, tok, device=0)
pipe("สวัสดีครับ", max_new_tokens=50, top_k=40, do_sample=True, temperature=0.9)
>> [{'generated_text': 'สวัสดีครับ ค.. ดี. แรง 1-". และทำ<|endoftext|>'}]  ## incoherent fragments
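A common cause of gibberish after checkpoint conversion is a mis-mapped, missing, or transposed weight. A quick sanity check is to load both checkpoints into plain name-to-weights dicts (e.g. via `torch.load` and `model.state_dict()`) and diff them key by key. The sketch below is purely illustrative, not part of any conversion script; it uses nested Python lists as stand-ins for tensors, and the `neox`/`hf` dicts with keys like `embed.weight` are made-up toy data.

```python
def flatten(x):
    """Yield scalars from an arbitrarily nested list (stand-in for a tensor)."""
    if isinstance(x, list):
        for item in x:
            yield from flatten(item)
    else:
        yield x

def compare_state_dicts(a, b, atol=1e-5):
    """Return (missing_keys, mismatched_keys) between two name->weights dicts."""
    missing = sorted(set(a) ^ set(b))  # keys present in only one checkpoint
    mismatched = []
    for key in sorted(set(a) & set(b)):
        va, vb = list(flatten(a[key])), list(flatten(b[key]))
        if len(va) != len(vb):  # shape/size mismatch
            mismatched.append(key)
        elif max(abs(x - y) for x, y in zip(va, vb)) > atol:
            mismatched.append(key)
    return missing, mismatched

# Toy example: one matching key, one key with diverging values, one missing key.
neox = {"embed.weight": [[1.0, 2.0], [3.0, 4.0]],
        "attn.qkv": [[1.0, 0.0], [0.0, 1.0]]}
hf = {"embed.weight": [[1.0, 2.0], [3.0, 4.0]],
      "attn.qkv": [[1.0, 0.5], [0.0, 1.0]],
      "extra.bias": [0.0]}

missing, mismatched = compare_state_dicts(neox, hf)
print(missing)      # ['extra.bias']
print(mismatched)   # ['attn.qkv']
```

If every shared key matches within tolerance, the problem more likely lies in the tokenizer or generation config rather than the weights themselves.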
I have tried multiple conversion scripts so far (e.g., this and this) without success.
All the related files (weights, config, and tokenizer) are in my Google Drive.
Any help is greatly appreciated!
Issue Analytics
- Created: a year ago
- Comments: 8 (6 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Still working on finding the possible issue here; I'll keep you posted!
Thank you!! I’ll try this to convert your checkpoint as soon as I can, hopefully later today or early tomorrow!