Load optimizer state from checkpoint file.
🚀 Feature
In incremental training, we need to load the optimizer state along with the weights and send it to the trainer to continue training. But the optimizer appears to be missing after loading the module from the checkpoint file.
ckpt_path = checkpoint_callback.best_model_path
ckpt_model = MyModule.load_from_checkpoint(ckpt_path)
ckpt_model.optimizers()  # this is empty; the optimizer state was not restored
trainer.fit(ckpt_model)  # starts with a fresh optimizer instead of reusing the optimizer state from the checkpoint
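For context, the optimizer state does appear to be written into the checkpoint file itself; it is only the load_from_checkpoint path that drops it. A minimal sketch to confirm this, assuming ckpt_path points to a Trainer-saved checkpoint (where Lightning stores the "state_dict" and "optimizer_states" entries):

import torch

checkpoint = torch.load(ckpt_path, map_location="cpu")
print(list(checkpoint.keys()))              # includes "state_dict" and "optimizer_states"
print(len(checkpoint["optimizer_states"]))  # one entry per optimizer from configure_optimizers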
Motivation
In our pipeline, we load the checkpoint on the CPU side (a Spark cluster) and send the model to the GPU side (a Ray cluster) to run the remote training job. This works with Keras because when I load a compiled Keras model, the optimizer weights are part of it, so I can send the compiled model to the GPU side to train it incrementally.
Pitch
When loading the LightningModule from a checkpoint, the optimizer state should be part of it, so that I can pass the module to the trainer to continue training it.
# load model checkpoint in CPU side
ckpt_path = checkpoint_callback.best_model_path
ckpt_model = MyModule.load_from_checkpoint(ckpt_path, include_optimizer=True)
ckpt_model.optimizers()  # should show the optimizer loaded from the checkpoint
# send to GPU cluster for training
# ckpt_model = deserialize(serialize(ckpt_model))
# start the training job on the GPU cluster
trainer.fit(ckpt_model)  # trains with the old optimizer state from the checkpoint
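Until something like include_optimizer exists, a rough workaround is to pull the state out of the checkpoint dict manually. This is only a sketch and assumes a single optimizer returned by configure_optimizers, plus the "optimizer_states" key that Lightning writes into Trainer-saved checkpoints:

import torch

checkpoint = torch.load(ckpt_path, map_location="cpu")
ckpt_model = MyModule.load_from_checkpoint(ckpt_path)

# Build a fresh optimizer from the user-defined hook, then overwrite it with the saved state.
optimizer = ckpt_model.configure_optimizers()
optimizer.load_state_dict(checkpoint["optimizer_states"][0])

Note that this only restores the optimizer object itself; the Trainer still builds its own optimizers during fit unless the checkpoint path is passed to it, as described in the comment below.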
Alternatives
Additional context
If you enjoy Lightning, check out our other projects! ⚡
- Metrics: Machine learning metrics for distributed, scalable PyTorch applications.
- Lite: enables pure PyTorch users to scale their existing code on any kind of device while retaining full control over their own loops and optimization logic.
- Flash: The fastest way to get a Lightning baseline! A collection of tasks for fast prototyping, baselining, fine-tuning, and solving problems with deep learning.
- Bolts: Pretrained SOTA Deep Learning models, callbacks, and more for research and production with PyTorch Lightning and PyTorch.
- Lightning Transformers: Flexible interface for high-performance research using SOTA Transformers leveraging PyTorch Lightning, Transformers, and Hydra.

There is documentation for restoring the full training state instead of just the model weights: https://pytorch-lightning.readthedocs.io/en/latest/common/checkpointing.html#restoring-training-state
I'll highlight this and make it more explicit in upcoming docs updates.
The trainer will load the optimizer states from the checkpoint if you pass the checkpoint path as an argument to trainer.fit.
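For example, something along these lines (a sketch assuming a Lightning version where Trainer.fit accepts a ckpt_path argument; older releases used Trainer(resume_from_checkpoint=...)):

import pytorch_lightning as pl

model = MyModule()
trainer = pl.Trainer()
# Restores model weights, optimizer and LR scheduler states, and loop counters from the checkpoint.
trainer.fit(model, ckpt_path=checkpoint_callback.best_model_path)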
Does that work for you? @jjenniferdai