
Attempting to log metrics per epoch returns a "step must increase" warning.

See original GitHub issue

Describe the bug

Apologies if the title is a bit ambiguous. I’m also not entirely sure whether this is a “bug,” but I’m hoping someone can help out. Currently I’m training and evaluating my model with two loops: an outer loop over epochs and an inner loop over the batched data.

I want to log separate plots for my metrics per epoch (e.g., one line plot for epoch 1, another for epoch 2, etc.), so I set the step keyword argument in wandb.log to the step from my enumerate statement. However, doing so gave me this warning:

wandb: WARNING Step must only increase in log calls.  Step x < y; dropping {'loss': something}.

I’m assuming this is because step gets reset to 0 on every epoch, and also that each plot is only meant to refer to one W&B run. I’m wondering if I can bypass that behavior and make it so that each epoch gets its own plot. I’ve taken a look at the documentation for Incremental Logging, but it didn’t solve my problem.

To Reproduce

You can run the following script as is and you should see the problem.

import torch
import torch.nn as nn
import torch.optim as optim
import wandb


class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.linear1 = nn.Linear(in_features=400, out_features=100)
        self.output_linear = nn.Linear(in_features=100, out_features=10)

    def forward(self, x):
        output1 = self.linear1(x)
        output2 = self.output_linear(output1)

        return output2


def main():
    model = Model()
    input_data = torch.randn(size=(3, 300, 400))
    targets = torch.empty(300, dtype=torch.long).random_(3)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(params=model.parameters())

    for epoch in range(3):
        for step, batch in enumerate(input_data):
            optimizer.zero_grad()
            output = model(batch)
            loss = criterion(output, targets)
            loss.backward()
            optimizer.step()

            # Removing the step assignment eliminates the warning.
            wandb.log({'loss': loss.item()}, step=step)


if __name__ == '__main__':
    wandb.init(project='testing-wandb')
    main()

Operating System

  • OS: Ubuntu 16.04
  • Browser: None.
  • Version: 0.10.12

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

7 reactions
cvphelps commented, Dec 9, 2020

Hi @seanswyi, thanks for reaching out. We recommend not setting step here, because it needs to always increase. Why do you want a separate chart for each epoch? Most of our users plot a time series and have this all on a single chart.

If you want different metrics for each epoch, name your metrics loss_epoch_1, loss_epoch_2, etc.
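A minimal sketch of that first suggestion, adapted from the reproduction script above (the f-string key names are just one way to build per-epoch metric names):

for epoch in range(3):
    for step, batch in enumerate(input_data):
        optimizer.zero_grad()
        output = model(batch)
        loss = criterion(output, targets)
        loss.backward()
        optimizer.step()

        # One metric key per epoch; no step= is passed, so the
        # internal step counter keeps increasing on its own.
        wandb.log({f'loss_epoch_{epoch}': loss.item()})

Each distinct key gets its own chart in the W&B workspace, so every epoch ends up on a separate plot.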

If you want to plot your metrics against different x-axes, you can log the step as a metric, like wandb.log({'loss': 0.1, 'epoch': 1, 'batch': 3}). In the UI you can switch between x-axes in the chart settings.

2 reactions
tyomhak commented, Dec 9, 2020

Hey @seanswyi, thanks for writing in! The step you are setting in wandb.log is used internally to track the history of what has been logged, which means each consecutive step must not be smaller than the previous one.

For your case I would suggest using a separate metric (let’s call it ‘custom_step’) to track the step, and passing it as an ordinary metric in the wandb.log call. Something like this should work for you:

for epoch in range(3):
    for custom_step, batch in enumerate(input_data):
        optimizer.zero_grad()
        output = model(batch)
        loss = criterion(output, targets)
        loss.backward()
        optimizer.step()

        # Log the step as an ordinary metric instead of passing step=.
        wandb.log({'loss': loss.item(), 'custom_step': custom_step})

After this, you could change the x-axis metric for the charts from ‘step’ to ‘custom_step’ in your workspace UI.
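Newer versions of the wandb client also expose wandb.define_metric, which lets you set the default x-axis from code instead of switching it in the chart settings; it may not be available in an older client such as the 0.10.12 mentioned above. A rough sketch, assuming a recent client:

import wandb

wandb.init(project='testing-wandb')

# Treat 'custom_step' as an x-axis metric and plot 'loss' against it
# by default, rather than against the internal step counter.
wandb.define_metric('custom_step')
wandb.define_metric('loss', step_metric='custom_step')

The training loop can then log {'loss': loss.item(), 'custom_step': custom_step} exactly as in the snippet above.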

