
Problem loading pretrained stories model (fairseq issue #216)

See original GitHub issue

I am having the same issue as #209 (now closed). When I run generate.py on the model files in stories_checkpoint.tar.bz2, I get:

| [wp_source] dictionary: 19032 types
| [wp_target] dictionary: 112832 types
| data-bin/writingPrompts test 15138 examples
| loading model(s) from models/fusion_checkpoint.pt
| loading pretrained model
RuntimeError: Error(s) in loading state_dict for FConvModelSelfAtt:
        While copying the parameter named "encoder.encoder.embed_tokens.weight", whose dimensions in the model are torch.Size([19032, 256]) and whose dimensions in the checkpoint are torch.Size([19025, 256]).
        While copying the parameter named "decoder.embed_tokens.weight", whose dimensions in the model are torch.Size([112832, 256]) and whose dimensions in the checkpoint are torch.Size([104960, 256]).
        While copying the parameter named "decoder.fc3.weight", whose dimensions in the model are torch.Size([112832, 256]) and whose dimensions in the checkpoint are torch.Size([104960, 256]).
        While copying the parameter named "decoder.fc3.bias", whose dimensions in the model are torch.Size([112832]) and whose dimensions in the checkpoint are torch.Size([104960]).

I can see that it’s some discrepancy in the vocabulary sizes between the model and checkpoint that is causing the problem. However, I binarized the writingPrompts dataset by running preprocess.py exactly as specified in the example. Here’s the output of that script:

| [wp_source] Dictionary: 19031 types
| [wp_source] examples/stories/writingPrompts/train.wp_source: 272600 sents, 8008372 tokens, 1.36% replaced by <unk>
| [wp_source] Dictionary: 19031 types
| [wp_source] examples/stories/writingPrompts/valid.wp_source: 15620 sents, 469336 tokens, 2.1% replaced by <unk>
| [wp_source] Dictionary: 19031 types
| [wp_source] examples/stories/writingPrompts/test.wp_source: 15138 sents, 440659 tokens, 2.24% replaced by <unk>
| [wp_target] Dictionary: 112831 types
| [wp_target] examples/stories/writingPrompts/train.wp_target: 272600 sents, 184176859 tokens, 0.771% replaced by <unk>
| [wp_target] Dictionary: 112831 types
| [wp_target] examples/stories/writingPrompts/valid.wp_target: 15620 sents, 10496165 tokens, 0.888% replaced by <unk>
| [wp_target] Dictionary: 112831 types
| [wp_target] examples/stories/writingPrompts/test.wp_target: 15138 sents, 10244721 tokens, 0.889% replaced by <unk>
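
For context, the preprocessing command referred to above is, per the fairseq stories example of that era, roughly the following; the exact flag values are an approximation and may differ between fairseq versions:

    # Binarize the WritingPrompts data into data-bin/writingPrompts
    TEXT=examples/stories/writingPrompts
    python preprocess.py --source-lang wp_source --target-lang wp_target \
        --trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test \
        --destdir data-bin/writingPrompts --thresholdtgt 10 --thresholdsrc 10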

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

1 reaction
hmc-cs-mdrissi commented on Jul 17, 2018

To deal with the dictionary padding, there's a preprocessing option, --padding-factor, which prior to that commit effectively defaulted to 1 but now defaults to 8. If you explicitly pass that option as 1, you should get the same vocab size.
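
Concretely, assuming the same invocation as in the stories example sketched above, the re-binarization would look roughly like this (only the last flag is new):

    # Keep dictionary sizes exact instead of rounding them up to a multiple of 8
    python preprocess.py --source-lang wp_source --target-lang wp_target \
        --trainpref $TEXT/train --validpref $TEXT/valid --testpref $TEXT/test \
        --destdir data-bin/writingPrompts --thresholdtgt 10 --thresholdsrc 10 \
        --padding-factor 1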

1 reaction
roemmele commented on Jul 17, 2018

OK, I did this and re-ran preprocess.py. The resulting vocab size for the target data matched the pretrained model, but my source vocab still had a few extra tokens. I eventually rolled back to a previous commit (745d5fbd7f640e1fd04f17981c4816659ad64c04) and re-ran preprocess.py to get the same source vocab as the pretrained model, so some recent change seems to have affected the vocab sizes. Once I did this, I could run generate.py with the current commit.
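
The rollback workaround above would look roughly like the sketch below; the commit hash is the one quoted in the comment, while the branch name and the generation flags are placeholders standing in for whatever was used originally:

    # Binarize the data with the older commit, whose default dictionary
    # handling matches the pretrained checkpoint
    git checkout 745d5fbd7f640e1fd04f17981c4816659ad64c04
    # ... re-run the same preprocess.py command as in the stories example ...
    # Then return to the current code and generate against the re-binarized data
    git checkout master
    python generate.py data-bin/writingPrompts --path models/fusion_checkpoint.pt  # plus your usual generation flags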

Read more comments on GitHub.

