accelerator.gather() at training time

Can I use accelerator.gather() at training time? Would gradients be calculated properly? My use case looks roughly like the toy snippet below. There seems to be a problem with gradient flow in this scheme, because my validation accuracy drops to 0.

# Wrap the model, optimizer, and dataloader for distributed training
model, optimizer, train_loader = accelerator.prepare(model, optimizer, train_loader)
for i, data in enumerate(train_loader):
    model.zero_grad()

    a, b = model(data)
    # Collect b from every process so that f sees the full batch
    b_all = accelerator.gather(b)
    c = f(a, b_all)
    loss = criterion(a, b, c)
    accelerator.backward(loss)
    optimizer.step()
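
One way to test the suspicion about gradient flow is to inspect whether a tensor is still attached to the autograd graph. The sketch below uses plain PyTorch only; `describe_grad_path` is a hypothetical helper, and `b_copy` is merely a stand-in for a tensor produced by an operation that does not record autograd history. It does not call Accelerate's actual gather.

```python
import torch


def describe_grad_path(name: str, tensor: torch.Tensor) -> bool:
    """Hypothetical helper: report whether `tensor` is attached to the autograd graph."""
    connected = tensor.requires_grad
    print(f"{name}: requires_grad={tensor.requires_grad}, grad_fn={tensor.grad_fn}")
    return connected


w = torch.randn(4, 3, requires_grad=True)
b = w * 2.0                    # produced by a differentiable op, so it has a grad_fn
b_copy = b.detach().clone()    # stand-in for an output that carries no autograd history

b_connected = describe_grad_path("b", b)
b_copy_connected = describe_grad_path("b_copy", b_copy)
```

Running the same check on `b_all` inside the training loop would show whether the loss can reach the model parameters through the gathered tensor.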

Issue Analytics

State:
Created 2 years ago
Reactions: 2
Comments: 8 (5 by maintainers)

Top GitHub Comments

sgugger commented, May 10, 2021

Thanks for the tip! With that, it should be possible to add a gather_with_grad function that works during training by leveraging the existing gather.
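
A minimal sketch of what such a function could look like, written against plain `torch.distributed` rather than Accelerate. This is an illustration under assumptions, not the implementation that was added to the library: the name `gather_with_grad` comes from the comment above, but the body uses a common workaround (gather detached copies, then put the local tensor back in its own slot), and it assumes every process holds a tensor of the same shape.

```python
import torch
import torch.distributed as dist


def gather_with_grad(local: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: concatenate `local` from all ranks along dim 0
    while keeping the autograd path to this rank's own slice."""
    if not (dist.is_available() and dist.is_initialized()):
        # Single-process run: there is nothing to gather.
        return local

    world_size = dist.get_world_size()
    rank = dist.get_rank()

    # all_gather writes into plain buffers that carry no autograd history.
    buffers = [torch.zeros_like(local) for _ in range(world_size)]
    dist.all_gather(buffers, local.detach().contiguous())

    # Swap this rank's detached copy for the original tensor so that
    # backward() can reach the local model parameters.
    buffers[rank] = local
    return torch.cat(buffers, dim=0)


# Single-process usage: the result stays connected to x.
x = torch.ones(2, 3, requires_grad=True)
out = gather_with_grad(x)
out.sum().backward()
```

With this approach, slices that came from other ranks are still treated as constants, so each process only backpropagates through its own contribution. Whether that is sufficient depends on the loss being used.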

github-actions[bot] commented, May 24, 2022

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed, please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.