wandb.log delayed

I am running wandb.log({"loss":loss}, step=epoch) at the end of every epoch. However, the logs are delayed by 1-2 epochs: the logs for the first epoch appear on the platform only after the second or third epoch has finished. Unfortunately, after stopping the code, the last epochs are missing from the logs.

Any ideas?

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Reactions: 5
  • Comments: 20 (11 by maintainers)

Top GitHub Comments

2 reactions
vanpelt commented, Oct 11, 2019

Hey @psinger @vadapalliravikumar, the long-term fix is a bit more involved, but we now better understand the underlying issue. The problem is that we use the step argument to decide when to push metrics to the server (only when it increases). Unfortunately, in Jupyter we don’t handle the final step properly, and this also delays metrics being pushed until the step increases. The quick fix is to not specify step and instead use the commit argument if you need to roll up metrics into a single step, e.g.

wandb.log({"acc": 0.5}, commit=False)  # held back until the next committing call
# Some other code
wandb.log({"val_acc": 0.6, "custom_step": 32})  # pushes acc and val_acc together as one step
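
To illustrate why gating pushes on an increasing step leaves the final epoch unsent, here is a toy model of the behavior described above. It is only a sketch: StepGatedLogger and its attributes are hypothetical names, and this is not wandb's actual implementation.

```python
class StepGatedLogger:
    """Toy model: buffers metrics and pushes them only when the step increases."""

    def __init__(self):
        self.pushed = []    # (step, metrics) pairs that reached the "server"
        self._step = None   # step currently being buffered
        self._buffer = {}   # metrics waiting for a later step to arrive

    def log(self, metrics, step):
        if self._step is not None and step > self._step:
            # A larger step arrived, so the previous step is treated as complete.
            self.pushed.append((self._step, self._buffer))
            self._buffer = {}
        self._step = step
        self._buffer.update(metrics)


logger = StepGatedLogger()
for epoch in range(3):
    logger.log({"loss": 1.0 / (epoch + 1)}, step=epoch)

# Epochs 0 and 1 were pushed; epoch 2 is still buffered because no later step arrived.
print([step for step, _ in logger.pushed])  # [0, 1]
```

In this model each epoch shows up one call late, and the last one never shows up unless something flushes the buffer when the run ends.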

Every time wandb.log is called without a step argument and without commit=False, we increment an internal step counter and push the metrics to the server. If you want to keep track of a different step counter, you can pass it into log just like any other metric. We then allow you to change the x-axis in the UI to any metric you’ve logged that is monotonically increasing.
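
Applied to the per-epoch loop from the question, the suggestion might look like the sketch below. The train function, its log_fn parameter, and the placeholder loss are assumptions made for illustration; a plain list stands in for wandb.log so the snippet runs without a W&B account.

```python
def train(log_fn, num_epochs):
    """Log once per epoch, passing the epoch as an ordinary metric instead of `step`."""
    for epoch in range(num_epochs):
        loss = 1.0 / (epoch + 1)  # placeholder for a real training loss
        # No `step=` argument: each call is committed on its own.
        log_fn({"loss": loss, "epoch": epoch})


# Stand-in for wandb.log: collect every logged dict in a list.
recorded = []
train(recorded.append, num_epochs=3)
print(recorded[-1])  # {'loss': 0.3333333333333333, 'epoch': 2}
```

In a real run you would call wandb.init() first and pass wandb.log as log_fn; the logged epoch metric can then be selected as the x-axis in the UI.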

I know this is unfortunate. We’re working on a big overhaul of our backend that should address these issues and make for a cleaner API. And again, apologies for not getting back to you sooner on this one.

1 reaction
vanpelt commented, May 3, 2020

Please try this again with 0.8.35; we made some big changes to the Jupyter integration that should fix your issues.

