
Resume training and plot


Hello,

I have a run that was terminated in the middle of training, and now I want to resume it. I didn't set resume=True in wandb.init(), but I saved the model separately using PyTorch. Is it possible to load the run and resume the previous plot?

Thank you

Issue Analytics

State:
Created 3 years ago
Comments: 8 (4 by maintainers)

Top GitHub Comments

5 reactions

vanpelt commented, Dec 18, 2020

@kargarisaac to manually resume an existing run, call wandb.init(id="YOUR_RUN_ID", project="YOUR_PROJECT", resume="must"). Subsequent calls to wandb.log will then append metrics to that run, as documented here: https://docs.wandb.com/library/resuming
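As a rough illustration of the call described above, here is a minimal sketch. The helper names `resume_kwargs` and `log_resumed_metrics` and the `loss` metric key are made up for this example and are not part of wandb. Only `wandb.init`, `wandb.log`, and `run.finish` are real wandb calls. It assumes you already know the run ID and have restored your PyTorch model yourself.

```python
def resume_kwargs(run_id: str, project: str) -> dict:
    """Build the wandb.init arguments quoted in the comment above.

    Hypothetical helper: it only packages the three arguments.
    """
    if not run_id:
        raise ValueError("run_id must be a non-empty string")
    # resume="must" makes wandb.init fail if the run does not already exist,
    # instead of silently starting a new run.
    return {"id": run_id, "project": project, "resume": "must"}


def log_resumed_metrics(run_id: str, project: str, losses) -> None:
    """Attach to an existing run and append further loss values to it.

    Hypothetical helper; `losses` is any iterable of numbers produced by
    your own training loop after reloading the checkpoint.
    """
    # Imported here so resume_kwargs can be used without wandb installed.
    import wandb

    run = wandb.init(**resume_kwargs(run_id, project))
    for loss in losses:
        # No explicit step is passed, so wandb assigns the next step itself.
        wandb.log({"loss": loss})
    run.finish()
```

A call would look like `log_resumed_metrics("YOUR_RUN_ID", "YOUR_PROJECT", losses)`, where the two placeholders stand for your own run ID and project name.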

2 reactions

vanpelt commented, Jul 8, 2021

@nbortych the run_id is available in the URL of the run itself, or from the overview page, e.g. https://wandb.ai/vanpelt/reproducibility/runs/3f87uku2/overview

You can find the run ID in the last part of the “Run Path” attribute or in the URL (3f87uku2 in this case).

You can also access the ID of the run programmatically in your script via wandb.run.id
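If all you have is the run's URL, the ID can be pulled out with a few lines of standard-library Python. This is only a sketch: `run_id_from_url` is a hypothetical helper, and it assumes the URL follows the `.../runs/<id>/...` layout seen in the example link above.

```python
from urllib.parse import urlparse


def run_id_from_url(url: str) -> str:
    """Return the path segment that follows "runs" in a run URL.

    Hypothetical helper; assumes the .../runs/<id>/... layout shown above.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if "runs" not in parts:
        raise ValueError(f"no 'runs' segment in URL: {url}")
    idx = parts.index("runs")
    if idx + 1 >= len(parts):
        raise ValueError(f"no run ID after 'runs' in URL: {url}")
    return parts[idx + 1]


if __name__ == "__main__":
    print(run_id_from_url(
        "https://wandb.ai/vanpelt/reproducibility/runs/3f87uku2/overview"
    ))
```

Inside a script that is still running, reading wandb.run.id directly, as mentioned above, is simpler. Saving that value next to your checkpoint avoids having to look it up later.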

Top Results From Across the Web

Tensorboard resume training plot - pytorch - Stack Overflow

I figured out how to continue the training plot. While creating the summarywriter, we need to provide the same log_dir that we used...

Saving and Loading Your Model to Resume Training in PyTorch

A simple PyTorch tutorial on how to resume training deep learning models.

Keras: Starting, stopping, and resuming training

To learn how to start, stop, and resume training with Keras, just keep reading! ... The training plot is overwritten upon each epoch...

Resume Training from Checkpoint Network - MathWorks

This example shows how to save checkpoint networks while training a deep learning network and resume training from a previously saved network.


"Step must only increase in log calls" when adding W&B logger after some training

Allow `wandb.init` to set `wandb.run.dir`.