AcceleratedOptimizer `zero_grad` argument not supported: `set_to_none`
See original GitHub issue

Currently the `AcceleratedOptimizer` class doesn't support the `set_to_none` argument of `zero_grad`. Is this an intentional exclusion?
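For context, a minimal sketch of the setup that runs into this limitation, assuming a typical accelerate training script (the model and optimizer below are purely illustrative):

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()

model = torch.nn.Linear(10, 2)                     # illustrative model
optimizer = torch.optim.AdamW(model.parameters())  # illustrative optimizer

# prepare() wraps the optimizer in AcceleratedOptimizer.
model, optimizer = accelerator.prepare(model, optimizer)

# Plain PyTorch (>= 1.7) accepts this:
#   optimizer.zero_grad(set_to_none=True)
# but at the time of this issue AcceleratedOptimizer.zero_grad() took no
# arguments, so passing set_to_none raised a TypeError.
optimizer.zero_grad()
```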
Issue Analytics
- State:
- Created: 2 years ago
- Comments: 6 (3 by maintainers)
Top Results From Across the Web

Why do we need to call zero_grad() in PyTorch?
As of v1.7.0, PyTorch offers the option to reset the gradients to None with optimizer.zero_grad(set_to_none=True) instead of filling them with a tensor of zeroes ...
Read more >

torch.optim.Optimizer.zero_grad — PyTorch 1.13 documentation
If the user requests zero_grad(set_to_none=True) followed by a backward pass, .grads are guaranteed to be None for params that did not receive ...
Read more >

1.3.5 PDF - PyTorch Lightning Documentation
In this guide we'll show you how to organize your PyTorch code into Lightning in 2 steps. Organizing your code with PyTorch Lightning ...
Read more >

Chainer Documentation
One is manually computing gradients and then calling the update() method with no arguments. Do not forget to clear the gradients beforehand! ...
Read more >

Chainer Documentation - Read the Docs
... with no arguments. Do not forget resetting gradients beforehand! >>> model.zerograds() >>> # compute gradient here ... >>> optimizer.update ...
Read more >
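For reference, the behaviour described in the first two results above can be seen directly in plain PyTorch (>= 1.7). A minimal, illustrative sketch with a toy model:

```python
import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# First pass: zero the gradients in place (they remain zero-filled tensors).
model(torch.randn(8, 4)).sum().backward()
optimizer.zero_grad(set_to_none=False)
print(model.weight.grad)  # tensor([[0., 0., 0., 0.]])

# Second pass: reset gradients to None (saves memory, skips the fill kernel).
model(torch.randn(8, 4)).sum().backward()
optimizer.zero_grad(set_to_none=True)
print(model.weight.grad)  # None
```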
Top GitHub Comments
This should be fixed by the PR mentioned above and you can test with a source install. Don’t hesitate to reopen the issue if you still have trouble with this!
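For readers who want the gist of the change, here is a rough sketch of how a wrapper can forward `set_to_none` while staying compatible with optimizers whose `zero_grad()` takes no arguments. This is illustrative only and not the exact code from the PR:

```python
# Illustrative only: forward set_to_none when the wrapped optimizer supports it.
# The actual AcceleratedOptimizer implementation in the PR may differ.
import inspect

import torch


class OptimizerWrapper:
    def __init__(self, optimizer: torch.optim.Optimizer):
        self.optimizer = optimizer

    def zero_grad(self, set_to_none=None):
        if set_to_none is None:
            # Caller did not ask for anything special: defer to the default.
            self.optimizer.zero_grad()
            return
        accepts = "set_to_none" in inspect.signature(self.optimizer.zero_grad).parameters
        if not accepts:
            raise ValueError(
                "The wrapped optimizer's zero_grad() does not accept `set_to_none`."
            )
        self.optimizer.zero_grad(set_to_none=set_to_none)
```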
I raised it at https://github.com/NVIDIA/apex/issues/1196. I've also seen a few other LAMB optimizers out there, and they simply inherit from the PyTorch `Optimizer` and don't implement a `zero_grad` method (so they get `set_to_none` support from the base class). Therefore I doubt this problem is widespread.
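To illustrate that inheritance point: an optimizer that subclasses `torch.optim.Optimizer` and does not override `zero_grad()` gets `set_to_none` for free from the base class. A hypothetical toy optimizer (not any real LAMB implementation):

```python
import torch


class ToyOptimizer(torch.optim.Optimizer):
    """Hypothetical optimizer that does not override zero_grad()."""

    def __init__(self, params, lr=1e-3):
        super().__init__(params, defaults={"lr": lr})

    @torch.no_grad()
    def step(self, closure=None):
        # Plain gradient descent, just to make the example runnable.
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is not None:
                    p.add_(p.grad, alpha=-group["lr"])


model = torch.nn.Linear(4, 1)
opt = ToyOptimizer(model.parameters())
model(torch.randn(2, 4)).sum().backward()

# zero_grad(set_to_none=True) comes from the torch.optim.Optimizer base class.
opt.zero_grad(set_to_none=True)
print(model.weight.grad)  # None
```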