MriModule Validation Loss Logged Incorrectly

Steps to reproduce:

1. Run "train_unet_demo.py" in fastmri/fastmri_examples/unet.
   - Preferably point it at a data folder that holds only a single data point each for train and val, because the error only appears after a full training and validation epoch.
2. Observe the following error:

pytorch_lightning.utilities.exceptions.MisconfigurationException: ModelCheckpoint(monitor='val_loss') not found in the returned metrics: ['loss', 'validation_loss', 'val_metrics/nmse', 'val_metrics/ssim', 'val_metrics/psnr']. HINT: Did you call self.log('val_loss', tensor) in the LightningModule?

The cause appears to be that MriModule was changed to log 'validation_loss' instead of 'val_loss' in its validation_epoch_end method, while ModelCheckpoint still monitors 'val_loss'. Renaming the logged key back to 'val_loss' fixes the issue. The change happened in this commit.
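
The mismatch can be illustrated without Lightning at all. The sketch below is a plain-Python stand-in, not PyTorch Lightning's actual implementation: `find_monitor` is a hypothetical helper, the metric keys are copied from the error message above, and the metric values are made up.

```python
# Hypothetical stand-in for the lookup that fails in the error above.
# This is NOT PyTorch Lightning's real code, only an illustration.

def find_monitor(monitor, logged_metrics):
    """Return the monitored value, or raise if the key was never logged."""
    if monitor not in logged_metrics:
        raise KeyError(
            f"monitor={monitor!r} not found in logged metrics: "
            f"{sorted(logged_metrics)}"
        )
    return logged_metrics[monitor]


# Keys as listed in the error message; the values are invented.
logged = {
    "loss": 0.5,
    "validation_loss": 0.4,
    "val_metrics/nmse": 0.1,
    "val_metrics/ssim": 0.8,
    "val_metrics/psnr": 30.0,
}

# The checkpoint monitors 'val_loss', but only 'validation_loss' was logged.
try:
    find_monitor("val_loss", logged)
    mismatch_detected = False
except KeyError:
    mismatch_detected = True

# The fix described in the issue: log the value under 'val_loss' again.
fixed = dict(logged)
fixed["val_loss"] = fixed.pop("validation_loss")
monitored_value = find_monitor("val_loss", fixed)
```

The alternative would be to change the monitored key to 'validation_loss' instead, but the issue reports that renaming the logged key back to 'val_loss' is what resolved it.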

Issue Analytics

State:
Created 3 years ago
Comments: 5 (3 by maintainers)

Top GitHub Comments

mmuckley commented, Nov 17, 2020

Okay it’s merged.

zmhh commented, Nov 17, 2020

Hello, yes that fixed the issue. Thanks!
