Same distribution, nonzero loss?
Hi, I observed that my model is failing to converge. While debugging the code, I am observing this peculiar behavior:
torch.sum(-F.softmax(student_out[0][0], -1) * F.log_softmax(student_out[0][0], -1), -1)
returns 6.9058
Shouldn’t this theoretically return 0 since both are the same distribution?
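For reference, the expression above sums -p * log p over a single distribution, which is the entropy H(p); it is zero only when p is one-hot, and for a near-uniform distribution over K classes it is roughly log K. A minimal standalone sketch with made-up logits (not the model's actual student_out):

import torch
import torch.nn.functional as F

# Uniform distribution over 1000 classes: entropy equals ln(1000) ~ 6.9078
logits = torch.zeros(1000)
p = F.softmax(logits, dim=-1)
entropy = torch.sum(-p * F.log_softmax(logits, dim=-1), dim=-1)
print(entropy)  # tensor(6.9078)

# Nearly one-hot distribution: entropy is effectively 0
sharp = torch.full((1000,), -20.0)
sharp[0] = 20.0
p = F.softmax(sharp, dim=-1)
print(torch.sum(-p * F.log_softmax(sharp, dim=-1), dim=-1))  # ~ tensor(0.)

If the head has on the order of 1000 outputs, a near-uniform distribution gives ln(1000) ≈ 6.9078, which is very close to the 6.9058 observed, so the value is consistent with this simply being the entropy of a nearly uniform softmax output.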
Issue Analytics
- Created 2 years ago
- Comments: 5
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
I see what you mean: it is indeed the cross-entropy loss, which is the loss used for DINO pretraining. What I don't understand is how, in the logs provided by the author, the cross-entropy loss can go as low as 2~3 when H(p, q) = H(p) + KL(p ‖ q).
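A rough numerical check of that identity, with made-up logits and temperatures (nothing here comes from the actual DINO run): since H(p, q) = H(p) + KL(p ‖ q), the cross-entropy is bounded below by H(p), so it can drop well below log K once the target p is sharpened, even though it never reaches 0 against a non-one-hot target.

import torch
import torch.nn.functional as F

torch.manual_seed(0)
K = 1000
teacher_logits = torch.randn(K)
student_logits = teacher_logits + 0.1 * torch.randn(K)  # student roughly tracking the teacher

p = F.softmax(teacher_logits / 0.1, dim=-1)        # low temperature -> sharp target
log_p = F.log_softmax(teacher_logits / 0.1, dim=-1)
log_q = F.log_softmax(student_logits / 0.1, dim=-1)

cross_entropy = torch.sum(-p * log_q)
entropy_p = torch.sum(-p * log_p)
kl_pq = torch.sum(p * (log_p - log_q))
print(cross_entropy.item(), (entropy_p + kl_pq).item())  # the two values agree up to float error
print(entropy_p.item())                                   # well below log(1000) ~ 6.9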
I see. I have found the piece of code for the “DINO loss” (cross-entropy as you mentioned):
https://github.com/facebookresearch/dino/blob/cb711401860da580817918b9167ed73e3eef3dcf/main_dino.py#L380-L390
https://github.com/facebookresearch/dino/blob/cb711401860da580817918b9167ed73e3eef3dcf/main_dino.py#L392-L402
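For orientation, here is a stripped-down sketch of that kind of teacher/student cross-entropy term. The helper name, tensor shapes and temperatures below are hypothetical; the linked implementation additionally handles teacher centering, the teacher-temperature schedule and multi-crop pairing.

import torch
import torch.nn.functional as F

def cross_entropy_term(teacher_out, student_out, teacher_temp=0.04, student_temp=0.1):
    # -sum_k q_t(k) * log q_s(k), averaged over the batch
    q_t = F.softmax(teacher_out / teacher_temp, dim=-1).detach()  # sharp target, no gradient
    log_q_s = F.log_softmax(student_out / student_temp, dim=-1)
    return torch.sum(-q_t * log_q_s, dim=-1).mean()

# Toy usage with random head outputs (batch of 8, hypothetical output dim of 256)
teacher_out = torch.randn(8, 256)
student_out = torch.randn(8, 256, requires_grad=True)
loss = cross_entropy_term(teacher_out, student_out)
loss.backward()
print(loss.item())

Detaching the teacher output mirrors the stop-gradient on the teacher branch: only the student receives gradients from this loss.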