Loading a model from PL 1.2 that was saved in PL 1.1 breaks
🐛 Bug
I trained and saved a model with PL 1.1, then tried to load it from an environment with PL 1.2, and loading breaks. Some PL-specific objects get pickled into the checkpoint, which shouldn't happen. See the error below:
Traceback (most recent call last):
  File "scripts/train_bart_seq2seq_augmented_kilt.py", line 45, in <module>
    model = BartSeq2SeqAugmented(**vars(args))
  File "/home/ndecao/modify-transformers-memory/src/models/bart_seq2seq_augmented_kilt.py", line 67, in __init__
    self.model = BartSeq2Seq.load_from_checkpoint(self.hparams.model_checkpoint)
  File "/home/ndecao/.anaconda3/envs/kilt37/lib/python3.7/site-packages/pytorch_lightning/core/saving.py", line 134, in load_from_checkpoint
    checkpoint = pl_load(checkpoint_path, map_location=lambda storage, loc: storage)
  File "/home/ndecao/.anaconda3/envs/kilt37/lib/python3.7/site-packages/pytorch_lightning/utilities/cloud_io.py", line 32, in load
    return torch.load(f, map_location=map_location)
  File "/home/ndecao/.anaconda3/envs/kilt37/lib/python3.7/site-packages/torch/serialization.py", line 594, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/home/ndecao/.anaconda3/envs/kilt37/lib/python3.7/site-packages/torch/serialization.py", line 853, in _load
    result = unpickler.load()
AttributeError: Can't get attribute '_gpus_arg_default' on <module 'pytorch_lightning.utilities.argparse_utils'
Expected behavior
The model should load without any error.
Environment
The model was trained and saved using PL 1.1.6 and loaded from PL 1.2.1
- PyTorch Version (e.g., 1.0): 1.7.1
- OS (e.g., Linux): Linux
- Python version: 3.9
This happens because the args contain a function, and you pass them into the model, which saves all args into the checkpoint (not your fault). Unpickling then fails outside a PL environment, or whenever the PL code changes. I believe I have a fix for this: #6898. @Borda, do you have a suggestion for a good place to add a test for this?
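To illustrate the mechanism described above, here is a minimal sketch that does not use PL at all. It assumes a hypothetical stand-in module named `fake_lib` that holds a function called `_gpus_arg_default` (the name is borrowed from the traceback; the body is made up). Pickle stores a function by its module and name only, so removing the function from the module afterwards makes the payload unloadable. The last two steps show a generic mitigation and a generic recovery trick; they are not necessarily what #6898 does.

```python
import pickle
import sys
import types

# Hypothetical stand-in for a library module; NOT part of PyTorch Lightning.
fake_lib = types.ModuleType("fake_lib")


def _gpus_arg_default(x):
    # Made-up body; only the name matters for pickling.
    return x


# Make the function look as if it had been defined inside fake_lib.
_gpus_arg_default.__module__ = "fake_lib"
_gpus_arg_default.__qualname__ = "_gpus_arg_default"
fake_lib._gpus_arg_default = _gpus_arg_default
sys.modules["fake_lib"] = fake_lib

# Hyperparameters that accidentally contain a function, as in the issue.
hparams = {"lr": 1e-3, "gpus": fake_lib._gpus_arg_default}
payload = pickle.dumps(hparams)  # stores only a reference: fake_lib._gpus_arg_default

# Simulate a library upgrade that removes the function.
del fake_lib._gpus_arg_default

try:
    pickle.loads(payload)
    error = None
except AttributeError as exc:
    error = exc
print(type(error).__name__, error)

# Generic mitigation when saving: drop callables before pickling.
safe_hparams = {k: v for k, v in hparams.items() if not callable(v)}
restored = pickle.loads(pickle.dumps(safe_hparams))

# Generic recovery for an old payload: put a stub back under the missing name.
fake_lib._gpus_arg_default = lambda x: x
recovered = pickle.loads(payload)
print(restored, recovered["lr"])
```

The stub only needs to exist under the expected name for the old payload to load; whether patching the real PL module in this way is safe for a given checkpoint is an assumption that would need checking against the installed version.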
Reproduces with:
@Borda I’ll try to reproduce using the BoringModel