
Is validation loss computed and output?

See original GitHub issue

Thank you for your great work. I'd like to ask a small question. While I can find evaluation scores such as mIoU, I cannot find the validation loss anywhere (TensorBoard, standard output, log.json, etc.).

  • Is validation loss not computed?
  • Is it computed but not output by default (and if so, can I output it by changing the config)?
  • Is it computed and output, and am I simply missing it?

I used the following config.

python tools/train.py configs/deeplabv3plus/deeplabv3plus_r50-d8_512x1024_80k_cityscapes.py

I set

workflow = [('train', 10), ('val', 1)]
evaluation = dict(interval=2000, metric='mIoU')

where 1 epoch = 300 iterations.
Thanks for any help.
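(For context on what the workflow entry does: mmcv-style runners walk this list and dispatch each entry to a train or val loop. A paraphrased sketch, not verbatim mmcv code; depending on the runner type, each count is consumed as epochs or iterations, and the names below are placeholders:)

    # Paraphrased sketch of how an mmcv-style runner consumes the
    # workflow list (approximation, not library code; 'progress' and
    # 'max_progress' are placeholder names for epoch/iteration counters).
    def run(self, data_loaders, workflow):
        while self.progress < self.max_progress:
            for i, (mode, count) in enumerate(workflow):
                step_loop = getattr(self, mode)   # self.train or self.val
                for _ in range(count):
                    if mode == 'train' and self.progress >= self.max_progress:
                        return
                    step_loop(data_loaders[i])    # calls train_step/val_step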

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 2
  • Comments: 12

Top GitHub Comments

5 reactions
tetsu-kikuchi commented, Jan 10, 2021

Hello @rubeea, sorry for the late reply. I have been crazily busy this week.

Thank you for your comment. Yes, I have changed the workflow to include val.
Unfortunately, I have not encountered a similar problem when setting [('train', 1)]:
both [('train', 1)] and [('train', 1), ('val', 1)] worked in my case.

For the original problem in this issue, that is, outputting the validation loss to TensorBoard, I found a workaround today.
In mmseg/models/segmentors/base.py, the validation loss is calculated in val_step:
https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/models/segmentors/base.py#L162

To show a loss in TensorBoard, the output dictionary needs the key 'log_vars'. This key exists in the train output (built in train_step) but not in the val output; that, I suppose, is why the val loss is not shown in TensorBoard. So I simply mimicked train_step and added the following after output = self(**data_batch, **kwargs) in val_step.

    # collections would normally be imported at the top of base.py
    import collections

    # Reduce the raw loss dict exactly as train_step does.
    loss, log_vars = self._parse_losses(output)

    # Prefix each key with 'val_' so TensorBoard can distinguish these
    # entries from the training losses logged by train_step.
    log_vars_val = collections.OrderedDict()
    for k, v in log_vars.items():
        log_vars_val['val_' + k] = v

    output = dict(
        loss=loss,
        log_vars=log_vars_val,
        num_samples=len(data_batch['img'].data))

I slightly changed the names by adding the prefix 'val_' to the keys; otherwise, I think, the val loss would not be distinguishable from the train loss in TensorBoard. In my case this workaround worked, and the val loss is shown in TensorBoard. (One unsatisfactory point is that the val loss appears under the 'train' tab… This is ugly but not a problem in practice.) I hope this helps.
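To see the fragment above in context, here is a sketch of the whole patched method. The method skeleton is reconstructed from the description above (the original val_step runs the forward pass and returns its raw output), so it is an assumption, not verbatim repository code:

    # Sketch of the patched val_step in mmseg/models/segmentors/base.py.
    # Assumes 'collections' is imported at the top of the module.
    def val_step(self, data_batch, **kwargs):
        # Forward pass on the validation batch (original code).
        output = self(**data_batch, **kwargs)

        # --- workaround starts here ---
        loss, log_vars = self._parse_losses(output)

        # The 'val_' prefix keeps these entries separate from the training
        # losses that train_step logs under the same base names.
        log_vars_val = collections.OrderedDict(
            ('val_' + k, v) for k, v in log_vars.items())

        output = dict(
            loss=loss,
            log_vars=log_vars_val,
            num_samples=len(data_batch['img'].data))
        # --- workaround ends here ---

        return output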

1 reaction
rubeea commented, Dec 24, 2020

(Quoting @tetsu-kikuchi's earlier comment:)

@rubeea Thank you for your reply. To be honest, I do not understand the details of DeepLab or the meaning of each loss function.

So, do you mean that the validation loss is implicitly computed in the code, but it is not output anywhere (TensorBoard, standard output, log.json, etc.)? Should I modify the code a little so that I can get the value of the validation loss?

The two kinds of losses you mentioned (decode.loss_seg and aux.loss_seg) appear on TensorBoard in the train tab, but I cannot find them in the validation tab. Only evaluation scores (aAcc, mAcc, and mIoU) appear in the validation tab. Possibly I am doing something stupid or misunderstanding something miserably.

I am still confused, but it seems that I should first learn the meaning of the losses and their implementation in the code. Thank you again.

Hi,

Actually you are right: those are indeed the training-data losses, while the metrics are computed on the validation dataset. Kindly report the solution here if you find a workaround. Thanks 😃
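For later readers, it may help to see the two mechanisms discussed in this thread side by side; a sketch using the values from the question:

    # EvalHook: runs inference on the val set every 2000 iterations and
    # reports metrics (aAcc / mAcc / mIoU). No loss is logged by this hook.
    evaluation = dict(interval=2000, metric='mIoU')

    # Runner workflow: interleaves validation passes with training.
    # This is what triggers val_step, where the workaround above makes
    # the validation loss reach TensorBoard.
    workflow = [('train', 10), ('val', 1)]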

Read more comments on GitHub >

Top Results From Across the Web

  • Validation loss - neural network (Data Science Stack Exchange): Validation loss is the same metric as training loss, but it is not used to update the weights. It is calculated in the...
  • Training and Validation Loss in Deep Learning (Baeldung): The validation loss is similar to the training loss and is calculated from a sum of the errors for each example in the...
  • Your validation loss is lower than your training loss? This is why!: Reason 3: Training loss is calculated during each epoch, but validation loss is calculated at the end of each epoch. Remember that each...
  • Why is my validation loss lower than my training loss?: Your training loss is continually reported over the course of an entire epoch; however, validation metrics are computed over the validation set...
  • How to compute the validation loss? (Simple linear regression): The validation loss is a flat line. It is not what I want. python · deep-learning · neural-network · pytorch · linear-regression.
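These results all make the same basic point: validation loss uses the same criterion as training loss but never updates the weights. A minimal PyTorch sketch for illustration (model, criterion, and val_loader are placeholders, not names from the thread):

    import torch

    def validation_loss(model, criterion, val_loader, device='cpu'):
        # Same criterion as training, but no backward pass and no
        # optimizer step: weights are never updated here.
        model.eval()                       # switch off dropout/BN updates
        total, count = 0.0, 0
        with torch.no_grad():              # gradients are not tracked
            for images, targets in val_loader:
                preds = model(images.to(device))
                total += criterion(preds, targets.to(device)).item() * len(images)
                count += len(images)
        return total / max(count, 1)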
