wandb.log delayed

I am running wandb.log({"loss": loss}, step=epoch) at the end of every epoch. However, the logs lag behind by 1-2 epochs: the metrics for the first epoch only appear on the platform once the second or third epoch has finished. Worse, after I stop the code, the last epochs never show up in the logs.

Any ideas?

Issue Analytics

Created 4 years ago
Reactions: 5
Comments: 20 (11 by maintainers)

Top GitHub Comments

2 reactions

vanpelt commented, Oct 11, 2019

Hey @psinger @vadapalliravikumar, the long-term fix is a bit more involved, but we now better understand the underlying issue. The problem is that we use the step argument to decide when to push metrics to the server (only when it increases). Unfortunately, in Jupyter we don’t handle the final step properly, and this also delays metrics being pushed until the step increases. The quick fix is to not specify step and instead use the commit argument if you need to roll up metrics into a single step. For example:

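import wandb  # assumes wandb.init(...) has already been called

wandb.log({"acc": 0.5}, commit=False)  # held back; the step is not advanced yet
# Some other code
wandb.log({"val_acc": 0.6, "custom_step": 32})  # commits both metrics in one step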
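To see why logging with an explicit step lags behind and drops the final epoch, here is a toy model of the "push only when the step increases" rule described above. It is only a sketch: the StepBufferedLogger class and its attributes are hypothetical names made up for illustration, and this is not how wandb is actually implemented.

```python
class StepBufferedLogger:
    """Toy model only; NOT wandb's real implementation."""

    def __init__(self):
        self.pushed = []      # (step, metrics) rows that reached the "server"
        self._pending = None  # (step, metrics) still waiting for a later step

    def log(self, metrics, step):
        # A pending row is pushed only once a strictly larger step arrives.
        if self._pending is not None and step > self._pending[0]:
            self.pushed.append(self._pending)
            self._pending = None
        if self._pending is None:
            self._pending = (step, dict(metrics))
        else:
            # Same (or older) step: merge into the row that is still pending.
            self._pending[1].update(metrics)


logger = StepBufferedLogger()
for epoch, loss in enumerate([0.9, 0.7, 0.5]):
    logger.log({"loss": loss}, step=epoch)

pushed_steps = [step for step, _ in logger.pushed]
print(pushed_steps)  # [0, 1]
```

In this model the row for the last epoch (step 2) stays pending because no larger step ever arrives, which matches the missing final epochs reported in the question.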

Every time wandb.log is called without a step argument and without commit=False, we increment an internal step counter and push the metrics to the server. If you want to keep track of a different step counter, you can pass it into log just like any other metric. We then allow you to change the x-axis in the UI to any metric you’ve logged that is monotonically increasing.
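Applied to the per-epoch loop from the question, that advice might look like the sketch below. The helper names log_epochs and run_with_wandb, the "epoch" metric key, and the project name are assumptions made for illustration. The logging function is passed in so the loop can be tried without a wandb account, and run.finish() assumes a recent wandb release (it may not exist in 0.8.x).

```python
def log_epochs(losses, log_fn):
    """Log one row per epoch, carrying the epoch number as an ordinary metric."""
    rows = []
    for epoch, loss in enumerate(losses):
        row = {"loss": loss, "epoch": epoch}
        log_fn(row)  # no step=..., so every call is committed on its own
        rows.append(row)
    return rows


def run_with_wandb(losses, project="my-project"):
    """Hypothetical wiring to wandb; the project name is a placeholder."""
    import wandb  # imported lazily so log_epochs works without wandb installed

    run = wandb.init(project=project)
    log_epochs(losses, wandb.log)
    run.finish()  # assumption: available in recent wandb releases
```

Calling log_epochs([0.9, 0.7], print) shows the rows that would be sent. In the UI, "epoch" could then be selected as the x-axis, since it increases monotonically.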

I know this is unfortunate. We’re working on a big overhaul of our backend that should address these issues and make for a cleaner API. And again, apologies for not getting back to you sooner on this one.

1 reaction

vanpelt commented, May 3, 2020

Please try this again with 0.8.35; we made some big changes to the Jupyter integration that should fix your issues.