Can't Generate from Pretrained Story Models
I ran the following command from the examples/stories tutorial using the pretrained checkpoints and couldn't get it to work. What is the correct command to generate from the pretrained story model? I saw a similar question in https://github.com/pytorch/fairseq/issues/285, but I wasn't sure whether it was resolved and, if so, what the correct command was.
```
python generate.py data-bin/writingPrompts --path data-bin/models/fusion_checkpoint.pt --batch-size 32 --beam 1 --sampling --sampling-topk 10 --sampling-temperature 0.8 --nbest 1 --model-overrides "{'pretrained_checkpoint':'data-bin/models/pretrained_checkpoint.pt'}"
```
My error is pasted below.
```
| [wp_target] dictionary: 104960 types
| data-bin/writingPrompts test 15138 examples
| ['data-bin/writingPrompts'] test 15138 examples
| loading model(s) from data-bin/models/fusion_checkpoint.pt
| loading pretrained model
  0%|          | 0/474 [00:00<?, ?it/s]Traceback (most recent call last):
  File "generate.py", line 171, in <module>
    main(args)
  File "generate.py", line 104, in main
    for sample_id, src_tokens, target_tokens, hypos in translations:
  File "/juicier/scr121/scr/hughz/fairseq/fairseq/sequence_generator.py", line 95, in generate_batched_itr
    prefix_tokens=s['target'][:, :prefix_size] if prefix_size > 0 else None,
  File "/juicier/scr121/scr/hughz/fairseq/fairseq/sequence_generator.py", line 117, in generate
    return self._generate(encoder_input, beam_size, maxlen, prefix_tokens)
  File "/juicier/scr121/scr/hughz/fairseq/fairseq/sequence_generator.py", line 143, in _generate
    encoder_out = model.encoder.reorder_encoder_out(encoder_out, new_order)
  File "/juicier/scr121/scr/hughz/fairseq/fairseq/models/composite_encoder.py", line 48, in reorder_encoder_out
    encoder_out[key] = self.encoders[key].reorder_encoder_out(encoder_out[key], new_order)
  File "/juicier/scr121/scr/hughz/fairseq/fairseq/models/fconv_self_att.py", line 231, in reorder_encoder_out
    eo.index_select(0, new_order) for eo in encoder_out['encoder_out']
  File "/juicier/scr121/scr/hughz/fairseq/fairseq/models/fconv_self_att.py", line 231, in <genexpr>
    eo.index_select(0, new_order) for eo in encoder_out['encoder_out']
RuntimeError: Expected object of type torch.cuda.LongTensor but found type torch.cuda.FloatTensor for argument #3 'index'
```
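The error is about the `index` argument to `index_select`: PyTorch requires it to be an integer (Long) tensor, and here `new_order` arrives as a float tensor. Below is a minimal standalone sketch of the failure mode (my own illustration, not fairseq code):

```python
import torch

x = torch.randn(4, 3)

# index_select requires an integer (Long) index tensor.
bad_order = torch.arange(4).float()   # float index, like arange on PyTorch 0.4.0
try:
    x.index_select(0, bad_order)      # raises: expected LongTensor for argument 'index'
except RuntimeError as e:
    print(e)

print(x.index_select(0, bad_order.long()))  # casting the index to long succeeds
```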
Top GitHub Comments
Upgrading to PyTorch 0.4.1 solved the above problem.
I think the attached commit (#393) should also fix this on 0.4.0, right? The problem is that arange on 0.4.0 returns a FloatTensor, whereas from 0.4.1 forward it returns a LongTensor.
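For context, this is the version difference the comment describes: the generator's `new_order` index is presumably built with `torch.arange`, which changed its default return type between releases. A hedged sketch of the behavior and the usual defensive fix (requesting an integer dtype explicitly); this is not necessarily the exact diff in #393:

```python
import torch

# On PyTorch 0.4.0, torch.arange(5) returned a FloatTensor;
# from 0.4.1 onward it returns a LongTensor by default.
new_order = torch.arange(5)
print(new_order.dtype)  # torch.int64 on >= 0.4.1

# Version-agnostic fix: ask for an integer dtype explicitly
# (or cast with .long() before using the tensor as an index).
new_order = torch.arange(5, dtype=torch.long)
```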