How to save checkpoints within lightning_logs?
I'm currently doing checkpointing as follows:
import os
import pytorch_lightning as pl

checkpoint_callback = pl.callbacks.ModelCheckpoint(
    filepath=os.path.join(os.getcwd(), 'checkpoints/{epoch}-{val_loss:.2f}'),
    verbose=True,
    monitor='val_loss',
    mode='min',
    save_top_k=-1,  # keep every checkpoint, not just the best ones
    period=1  # save after every epoch
)
trainer = pl.Trainer(
    default_save_path=os.path.join(os.getcwd(), 'log_files_are_stored_here'),
    gpus=1,
    max_epochs=2,
    checkpoint_callback=checkpoint_callback
)
This creates the following folder structure:
├── checkpoints  # all the .pth files are saved here
└── log_files_are_stored_here
    └── lightning_logs
        ├── version_0
        ├── version_1
        └── version_2
How can I get the .pth files for each version to be saved in the respective version folders, like so?
└── log_files_are_stored_here
    └── lightning_logs
        ├── version_0
        │   └── checkpoints  # save the .pth files here
        ├── version_1
        │   └── checkpoints  # save the .pth files here
        └── version_2
            └── checkpoints  # save the .pth files here
Hi,
this is how I do it:
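For illustration only, and not necessarily what was posted in this comment: one way to approach this is to work out the next `version_N` folder yourself and build the checkpoint path from it. The sketch below uses only the standard library. The helper name `next_version_checkpoint_dir` is hypothetical, and it assumes that versions are plain `version_<number>` folders under `lightning_logs`, as in the folder structure from the question.

```python
import re
from pathlib import Path


def next_version_checkpoint_dir(save_dir, name="lightning_logs"):
    """Hypothetical helper: find the next unused version_<n> under
    <save_dir>/<name> and return (n, <save_dir>/<name>/version_<n>/checkpoints).

    Nothing is created on disk; the function only inspects existing folders.
    """
    root = Path(save_dir) / name
    versions = []
    if root.is_dir():
        for child in root.iterdir():
            match = re.fullmatch(r"version_(\d+)", child.name)
            if child.is_dir() and match:
                versions.append(int(match.group(1)))
    next_version = max(versions) + 1 if versions else 0
    return next_version, root / f"version_{next_version}" / "checkpoints"


version, ckpt_dir = next_version_checkpoint_dir("log_files_are_stored_here")
# Same template style as the `filepath` argument in the question.
filepath = str(ckpt_dir / "{epoch}-{val_loss:.2f}")
print(version, filepath)
```

The resulting `filepath` could then be passed to `ModelCheckpoint(filepath=...)` as in the question. To keep the logger and the checkpoints in the same folder, the same `version` would probably also have to be given explicitly to the logger (for example `TensorBoardLogger(save_dir=..., version=version)`); I have not verified that pairing against a specific Lightning release.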
Hi, the TensorBoard version inspired by @chris-clem's snippet.
Any idea how to get rid of the "HACK"?