LogSumExp seems to be missing in SoftmaxCrossEntropyLoss (sparseLabel=false)
Description
On the latest master branch (commit 4b516196), in SoftmaxCrossEntropyLoss (sparseLabel=false), line 85 reads
loss = pred.mul(lab).neg().sum(new int[] {classAxis}, true);
and it looks like the LogSumExp term (highlighted in red in the original issue's formula) is missing.
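For reference (my own restatement, not text from the issue), writing the raw logits as p and the one-hot label as y, the softmax cross-entropy computed directly from logits is

L(p, y) = -\sum_i y_i \log \frac{e^{p_i}}{\sum_j e^{p_j}} = \log \sum_j e^{p_j} - \sum_i y_i p_i

so when pred holds raw logits, the \log \sum_j e^{p_j} (LogSumExp) term is required; pred.mul(lab).neg().sum(...) alone only produces the -\sum_i y_i p_i part.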
Proposed correction
int[] axes = new int[] {classAxis};
NDArray max = pred.max(axes, true);
NDArray logSumExp = max.add((pred.sub(max)).exp().sum(axes, true).log());
loss = logSumExp.sub(pred.mul(lab).sum(axes, true));
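A minimal, self-contained sketch of this correction on concrete numbers (my own illustration, assuming the public DJL NDArray API; the class name, example values, and the logSoftmax sanity check are not from the issue):

import ai.djl.ndarray.NDArray;
import ai.djl.ndarray.NDManager;
import ai.djl.ndarray.types.Shape;

public class LogSumExpSketch {
    public static void main(String[] args) {
        try (NDManager manager = NDManager.newBaseManager()) {
            int classAxis = 1; // class dimension of a (batch, classes) array
            int[] axes = new int[] {classAxis};

            // raw (unnormalized) logits for 2 samples and 3 classes
            NDArray pred = manager.create(new float[] {1f, 2f, 3f, 0f, 0f, 0f}, new Shape(2, 3));
            // one-hot labels
            NDArray lab = manager.create(new float[] {0f, 0f, 1f, 1f, 0f, 0f}, new Shape(2, 3));

            // numerically stable LogSumExp: subtract the per-sample max before exponentiating
            NDArray max = pred.max(axes, true);
            NDArray logSumExp = max.add(pred.sub(max).exp().sum(axes, true).log());

            // corrected loss: logSumExp(pred) - sum(lab * pred), per sample
            NDArray loss = logSumExp.sub(pred.mul(lab).sum(axes, true));

            // sanity check against the log-softmax formulation of the same loss
            NDArray reference = pred.logSoftmax(classAxis).mul(lab).neg().sum(axes, true);

            System.out.println(loss);      // per-sample losses
            System.out.println(reference); // should match the line above up to float rounding
        }
    }
}

Subtracting the per-sample max before exponentiating does not change the result mathematically; it only prevents overflow for large logits.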
Issue Analytics
- Created: 3 years ago
- Comments: 9 (9 by maintainers)
Top GitHub Comments
Hi @roywei, thank you for looking into the issue. I suggest reopening it to correct the “sparseLabel=false” branch of the loss function.
Your unit test exercises the evaluate method of SoftmaxCrossEntropyLoss in a different context
and does not cover the problematic code path I was referring to.
My context is:
As you suggested, let’s look at an example from the Python APIs as a reference for a unit test:
It gives out:
The unit test with the current DJL code fails:
The unit test for my corrected code passes:
I have adjusted the loss function in the following way:
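The adjusted code itself is not reproduced in this excerpt; presumably it mirrors the proposed correction above. Purely as a hypothetical illustration (the class name, constructor, and integration details are my assumptions, not the actual DJL change), a standalone Loss built around the same formula could look like:

import ai.djl.ndarray.NDArray;
import ai.djl.ndarray.NDList;
import ai.djl.training.loss.Loss;

// Hypothetical stand-alone loss applying the proposed correction; expects raw logits
// and one-hot labels of the same shape.
public class LogitSoftmaxCrossEntropyLoss extends Loss {
    private final int classAxis;

    public LogitSoftmaxCrossEntropyLoss(int classAxis) {
        super("LogitSoftmaxCrossEntropyLoss");
        this.classAxis = classAxis;
    }

    @Override
    public NDArray evaluate(NDList labels, NDList predictions) {
        NDArray pred = predictions.singletonOrThrow();
        NDArray lab = labels.singletonOrThrow().reshape(pred.getShape());
        int[] axes = new int[] {classAxis};
        // numerically stable LogSumExp over the class axis
        NDArray max = pred.max(axes, true);
        NDArray logSumExp = max.add(pred.sub(max).exp().sum(axes, true).log());
        // per-sample loss, averaged over the batch
        return logSumExp.sub(pred.mul(lab).sum(axes, true)).mean();
    }
}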
… forget about my last comment on the gradient. I didn’t take into account the “mean”, which accounts for the factor of 2 in the gradient. Thanks again, and sorry for the “gradient noise”.