
Controlling the global step in TrainResult.log and EvalResult.log

See original GitHub issue

I can't find a way to control the global step when logging metrics to TrainResult using the log (or log_dict) functions. What is the proper way to use these functions so that the logged metrics show up in TensorBoard once per epoch? Currently, the steps shown in my TensorBoard are (63*i - 1) for i = 1, 2, and so on. This is my training_step function (validation_step is similar, using pl.EvalResult):

def training_step(self, batch, batch_idx) -> pl.TrainResult:
    x, mask = batch
    pred = self(x)                             # forward pass
    loss = self.loss_function(pred, mask)      # loss used for backpropagation
    result = pl.TrainResult(loss)
    # Log an extra metric; this is what should appear in TensorBoard.
    result.log("Trainer/cross_entropy_loss", self.cross_entropy_loss(pred, mask))
    return result

I tried setting most of the TrainResult.log parameters manually (on_epoch, logger, sync_dist, reduce_fx, and so on).
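
For reference, a minimal sketch of that kind of attempt with the 0.9.0 result API (on_step and on_epoch are existing TrainResult.log parameters; whether this alone gives a once-per-epoch step axis is exactly what this issue is asking about):

def training_step(self, batch, batch_idx) -> pl.TrainResult:
    x, mask = batch
    pred = self(x)
    result = pl.TrainResult(self.loss_function(pred, mask))
    # Ask Lightning to aggregate the metric over the epoch and log it
    # once per epoch instead of once per optimizer step.
    result.log(
        "Trainer/cross_entropy_loss",
        self.cross_entropy_loss(pred, mask),
        on_step=False,
        on_epoch=True,
    )
    return result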

  • OS: Ubuntu 18.04.4
  • NVIDIA driver version: 440.33.01
  • CUDA versions available: cuda-9.0, cuda-9.2, cuda-10.0, cuda-10.1, cuda-10.2 (default: cuda-10.0)
  • torch==1.5.1
  • pytorch-lightning==0.9.0

Thanks a lot in advance.

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 11 (6 by maintainers)

Top GitHub Comments

1 reaction
ydcjeff commented, Sep 3, 2020

Hi @shalgi-beyond, setting trainer = pl.Trainer(gpus=1, log_save_interval=1, row_log_interval=1) would do the trick. Since I am not quite familiar with TensorBoard, I have created a question on the forum for you: https://forums.pytorchlightning.ai/t/log-save-interval-and-row-log-interval/135
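
For context, a minimal sketch of the suggested configuration (flag names as in pytorch-lightning 0.9.0; model is a placeholder for your LightningModule instance):

import pytorch_lightning as pl

# Write a logger row on every step and flush logs to disk every step,
# instead of batching them into the default intervals.
trainer = pl.Trainer(gpus=1, log_save_interval=1, row_log_interval=1)
trainer.fit(model)  # `model` is your LightningModule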

0 reactions
stale[bot] commented, Oct 21, 2020

This issue has been automatically marked as stale because it hasn't had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions, PyTorch Lightning Team!

Read more comments on GitHub >

Top Results From Across the Web

Controlling the global step in TrainResult.log and EvalResult.log
I can't find a way to control the global step when logging metrics to TrainResult using the log (or log_dict) functions.
Read more >
Log_save_interval and row_log_interval - Trainer - Lightning AI
I noticed logging in tensorboard is done at row_log_interval ... GH issue: Controlling the global step in TrainResult.log and EvalResult.log ...
Read more >
[PyTorch Lightning] Log Training Losses when Accumulating ...
[PyTorch Lightning] Log Training Losses when Accumulating Gradients. The global step is not what you think it is.
Read more >
synced BatchNorm, DataModules and final API! | by PyTorch ...
They are meant to control where and when to log and how synchronization is done ... TrainResult default is to log every step...
Read more >
PyTorch-Lightning Documentation
The return object TrainResult controls where to log, when to log ... Note: Lightning saves all aspects of training (epoch, global step, etc....
Read more >
