Controlling the global step in TrainResult.log and EvalResult.log
See original GitHub issue

Is there a way to control the global step when logging metrics to TrainResult using the log (or log_dict) functions? What is the proper way to use these functions so that the logged metrics show up in TensorBoard once per epoch? Currently, the steps showing in my TensorBoard are 63*i - 1 for i = 1, 2, and so on. This is my training_step function (validation_step is similar, using pl.EvalResult):
def training_step(self, batch, batch_idx) -> pl.TrainResult:
    x, mask = batch
    pred = self(x)
    loss = self.loss_function(pred, mask)
    # TrainResult takes the loss to minimize; the extra metric is logged for TensorBoard
    result = pl.TrainResult(loss)
    result.log("Trainer/cross_entropy_loss", self.cross_entropy_loss(pred, mask))
    return result
I have tried setting most of the TrainResult.log parameters manually (on_epoch, logger, sync_dist, reduce_fx, and so on).
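For illustration, the sketch below is one plausible way to spell out such an attempt with the 0.9.0 Result API; it is not taken from the issue, which does not say exactly which combination of arguments was tried. Here on_step=False suppresses the per-step row, on_epoch=True requests an epoch-level value, and reduce_fx=torch.mean only restates the default reduction explicitly; self.loss_function and self.cross_entropy_loss are the issue author's own helpers.

import torch
import pytorch_lightning as pl

def training_step(self, batch, batch_idx) -> pl.TrainResult:
    x, mask = batch
    pred = self(x)
    loss = self.loss_function(pred, mask)
    result = pl.TrainResult(loss)
    # Ask for the epoch-level aggregate only: no per-step row, mean over the
    # epoch's batches, still routed to the attached logger (TensorBoard).
    result.log(
        "Trainer/cross_entropy_loss",
        self.cross_entropy_loss(pred, mask),
        on_step=False,
        on_epoch=True,
        logger=True,
        reduce_fx=torch.mean,
    )
    return result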
- OS: Ubuntu 18.04.4
- Nvidia driver version: 440.33.01
- CUDA versions available: cuda-10.0, cuda-10.1, cuda-10.2, cuda-9.0, cuda-9.2 (default: cuda-10.0)
- torch==1.5.1
- pytorch-lightning==0.9.0
Thank you very much in advance.
Hi @shalgi-beyond, setting

trainer = pl.Trainer(gpus=1, log_save_interval=1, row_log_interval=1)

would do the trick. Since I am not quite familiar with TensorBoard, I have created a question on the forum for you: https://forums.pytorchlightning.ai/t/log-save-interval-and-row-log-interval/135

This issue has been automatically marked as stale because it hasn't had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions, PyTorch Lightning Team!
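For completeness, here is a minimal runnable sketch of the Trainer configuration suggested in the reply above, against pytorch-lightning==0.9.0. TinyModule, the synthetic data, and the metric name are illustrative placeholders rather than anything from the issue, and gpus=1 is left out so the snippet runs on CPU. In this release, row_log_interval controls how often a metrics row is handed to the logger, while log_save_interval controls how often accumulated rows are flushed to disk.

import torch
from torch.utils.data import DataLoader, TensorDataset
import pytorch_lightning as pl

class TinyModule(pl.LightningModule):
    # Illustrative stand-in for the issue author's model.
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(4, 1)

    def forward(self, x):
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.mse_loss(self(x), y)
        result = pl.TrainResult(loss)
        result.log("train/loss", loss)
        return result

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.01)

    def train_dataloader(self):
        dataset = TensorDataset(torch.randn(64, 4), torch.randn(64, 1))
        return DataLoader(dataset, batch_size=8)

trainer = pl.Trainer(
    max_epochs=2,
    row_log_interval=1,   # hand a metrics row to the logger every step
    log_save_interval=1,  # flush the accumulated rows to disk every step
)
trainer.fit(TinyModule())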