Adaptive Pixel Intensity Loss generated NaN values while training
I was training on a custom human dataset. Batch size = 8, number of training images = 3800.
Number of steps trained before the error appeared = 75.
After the 75th step, training failed with this error:
RuntimeError: Function 'UpsampleBilinear2DBackward1' returned nan values in its 0th output.
The model trained successfully when using BCE loss.
We also checked for NaN values using torch.autograd.set_detect_anomaly(True),
but it returned False, indicating that no NaN values were found.
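For anyone reproducing this, here is a minimal sketch of how anomaly detection and an explicit finiteness check can be combined. Note that torch.autograd.set_detect_anomaly(True) switches a mode on rather than reporting a result: when a backward function produces NaN, the backward pass raises a RuntimeError of the same form as the one quoted above. The tiny Conv2d model, the random tensors, and the non_finite_names helper are all hypothetical stand-ins, not TRACER code, and BCE-with-logits is used only as a placeholder loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def non_finite_names(named_tensors):
    """Return the names of tensors that contain NaN or Inf (hypothetical helper)."""
    return [name for name, t in named_tensors
            if t is not None and not torch.isfinite(t).all()]


torch.manual_seed(0)
model = nn.Conv2d(3, 1, kernel_size=3, padding=1)   # stand-in for the real network
images = torch.rand(2, 3, 8, 8)                      # stand-in input batch
masks = (torch.rand(2, 1, 16, 16) > 0.5).float()     # stand-in binary targets

# Run forward and backward inside the context so that a NaN produced in
# backward raises a RuntimeError naming the offending backward function.
with torch.autograd.detect_anomaly():
    logits = model(images)
    logits = F.interpolate(logits, size=masks.shape[-2:],
                           mode="bilinear", align_corners=False)
    loss = F.binary_cross_entropy_with_logits(logits, masks)
    loss.backward()

# Explicit checks on the data, the loss, and the parameter gradients.
bad_inputs = non_finite_names(
    [("images", images), ("masks", masks), ("loss", loss.detach())])
bad_grads = non_finite_names(
    (name, p.grad) for name, p in model.named_parameters())
print("non-finite inputs:", bad_inputs)
print("non-finite grads:", bad_grads)
```

Running the same checks on each batch just before the failing step (around step 75 here) may help separate bad input data from a numerical problem inside the loss itself.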
Issue Analytics
- State:
- Created 2 years ago
- Comments:10 (4 by maintainers)
I am also curious about which part causes this issue. I have released a version of TRACER without edge generation. Replace your existing scripts with the released ones. I only tested it briefly, so if there is any problem, please let me know.
Thanks.
Thanks, let me check this out.