Problem loading pretrained stories model
I am having the same issue as #209 (now closed). When I run generate.py on the model files in stories_checkpoint.tar.bz2, I get:
| [wp_source] dictionary: 19032 types
| [wp_target] dictionary: 112832 types
| data-bin/writingPrompts test 15138 examples
| loading model(s) from models/fusion_checkpoint.pt
| loading pretrained model
RuntimeError: Error(s) in loading state_dict for FConvModelSelfAtt:
While copying the parameter named "encoder.encoder.embed_tokens.weight", whose dimensions in the model are torch.Size([19032, 256]) and whose dimensions in the checkpoint are torch.Size([19025, 256]).
While copying the parameter named "decoder.embed_tokens.weight", whose dimensions in the model are torch.Size([112832, 256]) and whose dimensions in the checkpoint are torch.Size([104960, 256]).
While copying the parameter named "decoder.fc3.weight", whose dimensions in the model are torch.Size([112832, 256]) and whose dimensions in the checkpoint are torch.Size([104960, 256]).
While copying the parameter named "decoder.fc3.bias", whose dimensions in the model are torch.Size([112832]) and whose dimensions in the checkpoint are torch.Size([104960]).
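For context, the generate invocation follows the stories example. A minimal sketch, using the paths from the log above (the remaining flags are assumptions and may not match the exact command that was run):

```bash
# Sketch of the generate call (paths taken from the log above; other flags assumed)
python generate.py data-bin/writingPrompts \
  --path models/fusion_checkpoint.pt \
  --batch-size 32 --beam 1 --nbest 1
```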
I can see that a mismatch between the vocabulary sizes in my binarized data and those in the checkpoint is causing the problem. However, I binarized the writingPrompts dataset by running preprocess.py exactly as specified in the example. Here is the output of that script:
| [wp_source] Dictionary: 19031 types
| [wp_source] examples/stories/writingPrompts/train.wp_source: 272600 sents, 8008372 tokens, 1.36% replaced by <unk>
| [wp_source] Dictionary: 19031 types
| [wp_source] examples/stories/writingPrompts/valid.wp_source: 15620 sents, 469336 tokens, 2.1% replaced by <unk>
| [wp_source] Dictionary: 19031 types
| [wp_source] examples/stories/writingPrompts/test.wp_source: 15138 sents, 440659 tokens, 2.24% replaced by <unk>
| [wp_target] Dictionary: 112831 types
| [wp_target] examples/stories/writingPrompts/train.wp_target: 272600 sents, 184176859 tokens, 0.771% replaced by <unk>
| [wp_target] Dictionary: 112831 types
| [wp_target] examples/stories/writingPrompts/valid.wp_target: 15620 sents, 10496165 tokens, 0.888% replaced by <unk>
| [wp_target] Dictionary: 112831 types
| [wp_target] examples/stories/writingPrompts/test.wp_target: 15138 sents, 10244721 tokens, 0.889% replaced by <unk>
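For reference, the binarization command in the stories example looks roughly like the sketch below (the threshold values are assumed from the example README and may differ across fairseq versions):

```bash
# Binarize the writingPrompts data as in the stories example (flags assumed from the README)
TEXT=examples/stories/writingPrompts
python preprocess.py --source-lang wp_source --target-lang wp_target \
  --trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test \
  --destdir data-bin/writingPrompts \
  --thresholdtgt 10 --thresholdsrc 10
```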
Top GitHub Comments
To control how the dictionary is padded, there's a preprocessing option --padding-factor that prior to that commit effectively defaulted to 1 but now defaults to 8. If you explicitly pass that option as 1, you should get the same vocab size.
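Illustratively, re-running the binarization with padding disabled might look like the sketch below; apart from --padding-factor 1, the flags are assumed from the stories example:

```bash
# Re-binarize with dictionary padding disabled so the dictionary sizes are not
# rounded up to a multiple of 8 (other flags assumed from the stories example)
python preprocess.py --source-lang wp_source --target-lang wp_target \
  --trainpref examples/stories/writingPrompts/train \
  --validpref examples/stories/writingPrompts/valid \
  --testpref examples/stories/writingPrompts/test \
  --destdir data-bin/writingPrompts \
  --thresholdtgt 10 --thresholdsrc 10 \
  --padding-factor 1
```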
Ok, I did this and re-ran preprocess.py. The resulting vocab size for the target data matched the pretrained model, but my source vocab still had a few extra tokens. I eventually rolled back to a previous commit (745d5fbd7f640e1fd04f17981c4816659ad64c04) and re-ran preprocess.py in order to get the same source vocab as the pretrained model, so there seems to be some recent change that affected the vocab sizes. Once I did this I could run generate.py with the current commit.
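A hypothetical reconstruction of that workaround: binarize the data at the older commit so the source dictionary matches the checkpoint, then switch back to the current commit for generation. The branch name and the preprocess/generate flags below are assumptions, not the exact commands from the thread:

```bash
# Build the dictionaries with the older commit, then return to the current checkout
git checkout 745d5fbd7f640e1fd04f17981c4816659ad64c04
python preprocess.py --source-lang wp_source --target-lang wp_target \
  --trainpref examples/stories/writingPrompts/train \
  --validpref examples/stories/writingPrompts/valid \
  --testpref examples/stories/writingPrompts/test \
  --destdir data-bin/writingPrompts --thresholdtgt 10 --thresholdsrc 10
git checkout master
python generate.py data-bin/writingPrompts --path models/fusion_checkpoint.pt \
  --batch-size 32 --beam 1 --nbest 1
```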