
load_from_checkpoint: checkpoint['module_arguments'] KeyError

See original GitHub issue

After training, I load my best checkpoint and run trainer.test. This fails with the following error in v0.7.6. Have people encountered this before? My unit tests, which don't call finetune.py through the command line, do not encounter this issue.

Thanks in advance! Happy to make a reproducible example if this is a new/unknown bug.

    model = model.load_from_checkpoint(checkpoints[-1])
  File "/home/shleifer/miniconda3/envs/nb/lib/python3.7/site-packages/pytorch_lightning/core/lightning.py", line 1563, in load_from_checkpoint
    checkpoint[CHECKPOINT_KEY_MODULE_ARGS].update(kwargs)

KeyError: 'module_arguments'

model is a pl.LightningModule; checkpoints[-1] was saved by it, with the save_weights_only=True kwarg specified.
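
For context, here is a minimal sketch of the failing pattern; the module, its arguments, and the checkpoint path are illustrative, not taken from the report:

    import pytorch_lightning as pl
    import torch

    class ToyModule(pl.LightningModule):  # hypothetical module, for illustration only
        def __init__(self, hidden_dim=32):
            super().__init__()
            self.layer = torch.nn.Linear(hidden_dim, 1)

    # ... train, checkpointing with save_weights_only=True ...

    # If the saved checkpoint carries no hyperparameter entry, rebuilding
    # the module from it raises the KeyError shown above:
    model = ToyModule.load_from_checkpoint("best.ckpt")  # KeyError: 'module_arguments'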

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

2 reactions
wolterlw commented, Sep 22, 2020

Still having issues when loading a checkpoint. When I manually examine the checkpoint saved by Lightning, it contains only the following keys:

    ['epoch', 'global_step', 'pytorch-lightning_version', 'checkpoint_callback_best_model_score', 'checkpoint_callback_best_model_path', 'optimizer_states', 'lr_schedulers', 'state_dict']

so when I try Module.load_from_checkpoint it fails, because the module's init parameters are not present in the checkpoint. OmegaConf is used to instantiate the module like this: lm = Module(**config.lightning_module_conf)

pytorch_lightning version 0.9.0
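
One possible workaround here, sketched under the assumption that PL 0.9.x forwards extra keyword arguments from load_from_checkpoint to the module's __init__ (the checkpoint path is illustrative):

    # Supply the init arguments the checkpoint lacks, reusing the same
    # OmegaConf config that instantiates the module:
    lm = Module.load_from_checkpoint(
        "path/to/checkpoint.ckpt",      # illustrative path
        **config.lightning_module_conf,
    )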

1 reaction
wolterlw commented, Oct 30, 2020

@FluidSense did you call self.save_hyperparameters() in LightningModule's __init__?
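
That suggestion, sketched out (the module and its arguments are illustrative):

    import pytorch_lightning as pl
    import torch

    class Module(pl.LightningModule):
        def __init__(self, hidden_dim=32, lr=1e-3):
            super().__init__()
            # Records the init arguments under self.hparams and writes them
            # into every checkpoint, so load_from_checkpoint can rebuild
            # the module without the arguments being passed again.
            self.save_hyperparameters()
            self.layer = torch.nn.Linear(hidden_dim, 1)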

Read more comments on GitHub >

Top Results From Across the Web

Unable to load model from checkpoint in Pytorch-Lightning
Cause: this happens because your model is unable to load the hyperparameters (n_channels, n_classes=5) from the checkpoint, as you do not save ...

Hparams not restored when using load_from_checkpoint ...
Problem: I'm having an issue where the model is training fine, and the saved checkpoint does indeed have the hparams used in training ...

[solved] KeyError: 'unexpected key "module.encoder ...
I am getting the following error while trying to load a saved model. KeyError: 'unexpected key "module.encoder.embedding.weight" in state_dict' This is the ...

Saving and Loading Models — PyTorch Tutorials 1.0.0 ...
Contents: What is a state_dict? Saving & Loading Model for Inference; Saving & Loading a General Checkpoint; Saving Multiple Models in One File; ...

Model Checkpointing — DeepSpeed 0.8.0 documentation
DeepSpeed provides routines for checkpointing model state during training. Loading Training Checkpoints: deepspeed.DeepSpeedEngine.load_checkpoint(self ...
