
WandB-Logger drops all the logged values in training step for PyTorch Lightning

See original GitHub issue

Training step:

def training_step(self, batch, batch_idx):
    x, y = batch
    logits, masks = self(x)
    loss = self.loss_func(logits, y)
    self.log("train-loss", loss, prog_bar=True)
    return {
        "loss": loss,
        "pred": torch.argmax(logits, -1),
        "labels": y,
    }

On epoch end it logs the precision and recall. After the validation step I get:

wandb: WARNING Step must only increase in log calls. Step 379 < 380; dropping {'train-loss': 0.494, 'train-precision': 0.9, 'train-recall': 1, 'epoch': 1}
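For context, wandb requires the step passed to wandb.log to increase monotonically; values logged with a smaller step are dropped with exactly this kind of warning. A minimal standalone sketch (the project name is made up) that reproduces the same class of warning:

import wandb

# Hypothetical reproduction of the "Step must only increase" warning.
run = wandb.init(project="step-warning-demo")

wandb.log({"train-loss": 0.494}, step=380)     # logged at step 380
wandb.log({"train-precision": 0.9}, step=379)  # step goes backwards -> wandb warns and drops the values

wandb.finish()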

WandB version: 0.10.10, PyTorch Lightning version: 1.0.6

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 23 (13 by maintainers)

Top GitHub Comments

9 reactions
isaacrob commented, Nov 20, 2020

Fixed it! I was calling trainer.logger.experiment.log in a callback to log Matplotlib images, and apparently this increases the step unless commit is set to False, in which case it gets grouped with the next logging message sent via PyTorch Lightning. Subtle bug! Everything’s OK now and working as expected 😃 thanks for your help @borisdayma. Hope @IsCoelacanth was able to resolve their issue too!
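For anyone hitting the same thing, here is a minimal sketch of that workaround (the callback class, hook choice, and plot contents are placeholders, not the original code): the Matplotlib figure is logged through the wandb run with commit=False, so the step is not advanced and the image is grouped with the next metrics logged via PyTorch Lightning.

import matplotlib.pyplot as plt
import wandb
from pytorch_lightning.callbacks import Callback


class PlotCallback(Callback):
    # Hypothetical callback: logs a Matplotlib figure without advancing the wandb step.
    def on_validation_epoch_end(self, trainer, pl_module):
        fig, ax = plt.subplots()
        ax.plot([0, 1, 2], [0, 1, 4])  # placeholder plot

        # commit=False: do not advance wandb's internal step here; the image
        # is committed together with the next log call made by Lightning.
        trainer.logger.experiment.log({"val-plot": wandb.Image(fig)}, commit=False)
        plt.close(fig)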

3 reactions
borisdayma commented, Mar 8, 2021

This was fixed with https://github.com/PyTorchLightning/pytorch-lightning/pull/5931. You can now just log at any time with self.log('my_metric', my_value) and you won’t have any dropped values. Just choose your x-axis appropriately in the UI (whether global_step or just the auto-incremented step).
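As a rough sketch of the post-fix usage (module internals, metric names, the compute_loss helper, and the project name are all placeholders): metrics logged through self.log from any hook are forwarded to wandb by the WandbLogger, and the x-axis (global_step vs. wandb's auto-incremented step) is picked in the W&B UI.

import pytorch_lightning as pl
from pytorch_lightning.loggers import WandbLogger


class LitModel(pl.LightningModule):
    def training_step(self, batch, batch_idx):
        loss = self.compute_loss(batch)      # hypothetical helper
        self.log("train-loss", loss, prog_bar=True)
        return loss

    def validation_step(self, batch, batch_idx):
        val_loss = self.compute_loss(batch)  # hypothetical helper
        self.log("val-loss", val_loss)


# Attach the logger; the rest of the model (forward, optimizers, fit call) is omitted here.
trainer = pl.Trainer(logger=WandbLogger(project="my-project"), max_epochs=1)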

Read more comments on GitHub >

Top Results From Across the Web

  • WandB-Logger drops all the logged values in training step for ...
    "I have this issue too! PyTorch Lightning resets step to 0 at the start of a new epoch, but wandb expects step to..."
  • WandbLogger — PyTorch Lightning 1.8.5.post0 documentation
    "A new W&B run will be created when training starts if you have not created one manually before with wandb.init(). Log metrics..."
  • Pytorch_Lightning + Wandb Starter - Kaggle
    "It incorporates pytorch lightning and also includes a very useful logger called wandb. pytorch lightning was made with reference to the following notebook..."
  • PyTorch Lightning - Documentation - Weights & Biases - WandB
    "PyTorch Lightning has a WandbLogger class that can be used to seamlessly log metrics, model weights, media and more. Just instantiate the WandbLogger..."
  • Lightning 1.6: Habana Accelerator, Bagua Distributed, Fault ...
    "Learn more about what's new in PyTorch Lightning 1.6, ... Saved checkpoints that use the global step value as part of the filename..."
