
MriModule Validation Loss Logged Incorrectly

See original GitHub issue

Steps to reproduce:

  • Run “train_unet_demo.py” in fastmri/fastmri_examples/unet
    • Preferably point it at a data folder containing only a single example each for train and val, since the error only surfaces after a full training and validation epoch
  • Observe the following error:

pytorch_lightning.utilities.exceptions.MisconfigurationException: ModelCheckpoint(monitor='val_loss') not found in the returned metrics: ['loss', 'validation_loss', 'val_metrics/nmse', 'val_metrics/ssim', 'val_metrics/psnr']. HINT: Did you call self.log('val_loss', tensor) in the LightningModule?

The issue appears to be that MriModule was changed to log 'validation_loss' instead of 'val_loss' in the validation_epoch_end method, while ModelCheckpoint still monitors 'val_loss'. Renaming it back to 'val_loss' fixes the issue. The change happened in this commit.
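The failure is ultimately just a dictionary-key lookup: ModelCheckpoint searches the dict of logged metrics for its monitor key, and 'val_loss' is not among the keys MriModule logged. The sketch below mimics that check in plain Python; the function name, metric values, and error text are illustrative, not Lightning's actual internals.

```python
def check_monitor(monitor, logged_metrics):
    """Simplified stand-in for ModelCheckpoint's metric lookup:
    raise if the monitored key was never logged."""
    if monitor not in logged_metrics:
        raise KeyError(
            f"ModelCheckpoint(monitor={monitor!r}) not found in the "
            f"returned metrics: {sorted(logged_metrics)}. "
            f"HINT: Did you call self.log({monitor!r}, tensor)?"
        )
    return logged_metrics[monitor]


# Roughly what MriModule logged after the offending commit
# (values are made up for illustration):
metrics = {
    "loss": 0.41,
    "validation_loss": 0.38,
    "val_metrics/nmse": 0.02,
}

# check_monitor("val_loss", metrics)  # would raise KeyError here

# The fix from the issue: log under 'val_loss' again.
metrics["val_loss"] = metrics.pop("validation_loss")
print(check_monitor("val_loss", metrics))  # 0.38
```

In the real module the equivalent one-line fix is calling `self.log("val_loss", ...)` instead of `self.log("validation_loss", ...)` in `validation_epoch_end`, so the key matches the default `ModelCheckpoint(monitor="val_loss")`.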

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments:5 (3 by maintainers)

Top GitHub Comments

1 reaction
mmuckley commented, Nov 17, 2020

Okay it’s merged.

1 reaction
zmhh commented, Nov 17, 2020

Hello, yes that fixed the issue. Thanks!

Read more comments on GitHub >

Top Results From Across the Web

Why my training and validation loss is not changing?
Your weights have diverged during training, and the network as a result is essentially broken. As it consists of ReLUs, I expect the...

What does it mean when train and validation loss diverge from ...
Possible explanations: coding error; overfitting due to differences in the training/validation data; skewed classes (and differences in ...

Your validation loss is lower than your training loss? This is why!
During validation and testing, your loss function only comprises prediction error, resulting in a generally lower loss than the training set. Image by...

Why is my validation loss lower than my training loss?
Reason #2: Training loss is measured during each epoch while validation loss is measured after each epoch.

Overfit and underfit | TensorFlow Core
TensorBoard to generate TensorBoard logs for the training. ... If the validation metric is going in the wrong direction, the model is clearly...
