
training_epoch_end needs to return a "loss" key in the dict


📚 Documentation

Hi everyone!

In the docs detailing the usage of the logging function training_epoch_end, the code will fail if "loss": loss is not explicitly included in the returned dict.

The docs at https://pytorch-lightning.readthedocs.io/en/latest/experiment_reporting.html#log-metrics are not correct:

def training_epoch_end(self, outputs):
    loss = some_loss()
    ...

    logs = {'train_loss': loss}
    results = {'log': logs}
    return results

should be changed to

def training_epoch_end(self, outputs):
    loss = some_loss()
    ...

    logs = {'train_loss': loss}
    results = {'loss': loss, 'log': logs}  # <-- here is the change
    return results

Use case: I want to monitor the training metrics to check whether my network is able to overfit the data (for this functionality, see #1076).
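
For context, here is a minimal sketch of where such a return dict sits inside a 0.7-era LightningModule; the model, loss function, and optimizer below are illustrative placeholders, not taken from the issue:

import torch
import torch.nn.functional as F
import pytorch_lightning as pl

class OverfitCheck(pl.LightningModule):
    # Minimal module that logs the training loss so that overfitting
    # a small dataset is visible in the logger.
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(28 * 28, 10)

    def forward(self, x):
        return self.layer(x.view(x.size(0), -1))

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self(x), y)
        # 'loss' is the key the training loop consumes for backprop;
        # 'log' holds the metrics sent to the logger.
        return {'loss': loss, 'log': {'train_loss': loss}}

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)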

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
rpatrik96 commented, Mar 21, 2020

Oh, you are right, I was using an old version; interestingly, I got no warning even on a server running the newest Lightning version. Is there any way to enable explicit warnings about soon-to-be-deprecated features? I like to keep my code as up to date as possible, although that does not always happen, as in this case. Anyway, I am closing this issue.
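
As a general Python note (not Lightning-specific), deprecation warnings are often hidden by the interpreter's default warning filters; a minimal sketch to surface them:

import warnings

# Force DeprecationWarning to always display; Python hides it by default
# outside of __main__. Equivalent to running with
# `python -W always::DeprecationWarning`.
warnings.simplefilter("always", DeprecationWarning)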

0 reactions
awaelchli commented, Mar 21, 2020

training_epoch_end does not exist yet, right (#1076)? When you say the code fails, you probably used the old training_end, which operated on batches and has been renamed to training_step_end. There the loss key is required.

I don’t see why the loss key should be mandatory in training_epoch_end. As far as I know, validation_epoch_end also does not require a loss key in the return dict.
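
To make the distinction concrete, here is a sketch of the two contracts as described above; the val_loss key and the logged metric names are assumptions for illustration, not taken from the issue:

import torch

def training_step_end(self, outputs):
    # Formerly training_end: runs after each training step, and the
    # returned dict must contain a 'loss' key for backpropagation.
    loss = outputs['loss']
    return {'loss': loss, 'log': {'train_loss': loss}}

def validation_epoch_end(self, outputs):
    # Runs once per validation epoch; no 'loss' key is required here,
    # only whatever metrics should be logged.
    avg_loss = torch.stack([o['val_loss'] for o in outputs]).mean()
    return {'log': {'avg_val_loss': avg_loss}}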
