AcceleratedOptimizer `zero_grad` argument not supported: `set_to_none`
See original GitHub issue

Currently the `AcceleratedOptimizer` class doesn't support the `set_to_none` argument of `zero_grad`. Is this an intentional exclusion?
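For context, a minimal sketch of the setup that runs into this limitation, assuming a typical accelerate training script (the model and optimizer below are purely illustrative):

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()

model = torch.nn.Linear(10, 2)                     # illustrative model
optimizer = torch.optim.AdamW(model.parameters())  # illustrative optimizer

# prepare() wraps the optimizer in AcceleratedOptimizer.
model, optimizer = accelerator.prepare(model, optimizer)

# Plain PyTorch (>= 1.7) accepts this:
#   optimizer.zero_grad(set_to_none=True)
# but at the time of this issue AcceleratedOptimizer.zero_grad() took no
# arguments, so passing set_to_none raised a TypeError.
optimizer.zero_grad()
```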
Issue Analytics
- State:
- Created: 2 years ago
- Comments: 6 (3 by maintainers)
Top Results From Across the Web

Why do we need to call zero_grad() in PyTorch?
As of v1.7.0, PyTorch offers the option to reset the gradients to None with optimizer.zero_grad(set_to_none=True) instead of filling them with a tensor of zeroes ...
Read more >

torch.optim.Optimizer.zero_grad — PyTorch 1.13 documentation
If the user requests zero_grad(set_to_none=True) followed by a backward pass, .grads are guaranteed to be None for params that did not receive ...
Read more >

1.3.5 PDF - PyTorch Lightning Documentation
In this guide we'll show you how to organize your PyTorch code into Lightning in 2 steps. Organizing your code with PyTorch Lightning ...
Read more >

Chainer Documentation
One is manually computing gradients and then calling the update() method with no arguments. Do not forget to clear the gradients beforehand! ...
Read more >

Chainer Documentation - Read the Docs
... with no arguments. Do not forget resetting gradients beforehand! >>> model.zerograds() >>> # compute gradient here ... >>> optimizer.update ...
Read more >
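For reference, the behaviour described in the first two results above can be seen directly in plain PyTorch (>= 1.7). A minimal, illustrative sketch with a toy model:

```python
import torch

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# First pass: zero the gradients in place (they remain zero-filled tensors).
model(torch.randn(8, 4)).sum().backward()
optimizer.zero_grad(set_to_none=False)
print(model.weight.grad)  # tensor([[0., 0., 0., 0.]])

# Second pass: reset gradients to None (saves memory, skips the fill kernel).
model(torch.randn(8, 4)).sum().backward()
optimizer.zero_grad(set_to_none=True)
print(model.weight.grad)  # None
```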
Top GitHub Comments
This should be fixed by the PR mentioned above and you can test with a source install. Don’t hesitate to reopen the issue if you still have trouble with this!
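For readers who want the gist of the change, here is a rough sketch of how a wrapper can forward `set_to_none` while staying compatible with optimizers whose `zero_grad()` takes no arguments. This is illustrative only and not the exact code from the PR:

```python
# Illustrative only: forward set_to_none when the wrapped optimizer supports it.
# The actual AcceleratedOptimizer implementation in the PR may differ.
import inspect

import torch


class OptimizerWrapper:
    def __init__(self, optimizer: torch.optim.Optimizer):
        self.optimizer = optimizer

    def zero_grad(self, set_to_none=None):
        if set_to_none is None:
            # Caller did not ask for anything special: defer to the default.
            self.optimizer.zero_grad()
            return
        accepts = "set_to_none" in inspect.signature(self.optimizer.zero_grad).parameters
        if not accepts:
            raise ValueError(
                "The wrapped optimizer's zero_grad() does not accept `set_to_none`."
            )
        self.optimizer.zero_grad(set_to_none=set_to_none)
```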
I raised it at https://github.com/NVIDIA/apex/issues/1196. I've also seen a few other LAMB optimizers out there, and they simply inherit from the PyTorch `Optimizer` and don't implement a `zero_grad` method (so they get `set_to_none` support from the base class). Therefore I doubt this problem is widespread.
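To illustrate that inheritance point: an optimizer that subclasses `torch.optim.Optimizer` and does not override `zero_grad()` gets `set_to_none` for free from the base class. A hypothetical toy optimizer (not any real LAMB implementation):

```python
import torch


class ToyOptimizer(torch.optim.Optimizer):
    """Hypothetical optimizer that does not override zero_grad()."""

    def __init__(self, params, lr=1e-3):
        super().__init__(params, defaults={"lr": lr})

    @torch.no_grad()
    def step(self, closure=None):
        # Plain gradient descent, just to make the example runnable.
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is not None:
                    p.add_(p.grad, alpha=-group["lr"])


model = torch.nn.Linear(4, 1)
opt = ToyOptimizer(model.parameters())
model(torch.randn(2, 4)).sum().backward()

# zero_grad(set_to_none=True) comes from the torch.optim.Optimizer base class.
opt.zero_grad(set_to_none=True)
print(model.weight.grad)  # None
```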