
Training crashes due to inplace operation

See original GitHub issue

Hi there, I am trying to train this model on my custom dataset. Training starts, but after many iterations it crashes with this error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [2, 128, 64]], which is output 0 of ViewBackward, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

After enabling anomaly detection, here is the error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [2, 128, 64]], which is output 0 of ViewBackward, is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

The traceback is not very helpful, as it only points at the loss.backward() line and the torch.autograd module. Can you help me figure out where the issue might be?

Thanks
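
For readers landing here with the same error: the hint in the message refers to a one-line switch, torch.autograd.set_detect_anomaly(True), which makes autograd record a forward-pass traceback for every op, so the backward error points at the line that produced the modified tensor instead of at loss.backward(). A minimal sketch that reproduces the same class of version-mismatch error (the tensor shape mirrors the message above; it is not the issue's actual model):

    import torch

    # Record a forward traceback for every op, so the backward error
    # identifies the operation whose saved tensor was overwritten.
    torch.autograd.set_detect_anomaly(True)

    x = torch.randn(2, 128, 64, requires_grad=True)
    y = (x * 2).view(2, 128, 64)  # y is output 0 of a ViewBackward node
    z = y.sin()                   # sin saves y: its backward needs cos(y)
    y.add_(1)                     # in-place edit bumps y's version counter to 1
    z.sum().backward()            # RuntimeError: ... at version 1; expected version 0

With anomaly mode enabled, the printed traceback points at the y.sin() line, i.e. the op whose saved input was clobbered.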

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
lucidrains commented, May 19, 2020

Woohoo!

1 reaction
NaxAlpha commented, May 19, 2020

Thanks a lot. It’s working now!!

Read more comments on GitHub >

Top Results From Across the Web

dropout(inplace=True) raises error: one of the variables ...
Describe the bug: invoking nn.Dropout(..., inplace=True) will make the training crash with "one of the variables needed for gradient computation has been ...
Read more >
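
The failure mode in this first result is easy to reproduce. A hedged sketch (the layer choice is illustrative, not taken from the linked report): inplace dropout overwrites a tensor that an earlier op saved for its backward pass.

    import torch
    import torch.nn as nn

    x = torch.randn(4, 8, requires_grad=True)
    h = x.sigmoid()                           # sigmoid saves its output for backward
    out = nn.Dropout(p=0.5, inplace=True)(h)  # overwrites h in place
    out.sum().backward()                      # RuntimeError: modified by an inplace operation

Switching to inplace=False (the default) makes dropout allocate a fresh tensor, leaving the saved one intact.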
RuntimeError: one of the variables needed for gradient ...
In my case, the issue was caused by the += notation. ... counts as an inplace operation, but normal variable assignment does not...
Read more >
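
As this answer notes, += desugars to an in-place add_, while plain assignment creates a new tensor. A minimal sketch of the difference (not from the issue's code):

    import torch

    x = torch.randn(3, requires_grad=True)

    h = x.exp()           # exp saves its output h; backward computes grad * h
    h += 1                # in-place: bumps h's version, invalidating the saved tensor
    # h.sum().backward()  # would raise the version-mismatch RuntimeError

    h = x.exp()
    h = h + 1             # out-of-place add: new tensor, saved one stays valid
    h.sum().backward()    # works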
VAE-GAN: INPLACE operation error - vision - PyTorch Forums
During training Autoencoder (generator) model I always get the same error: "RuntimeError: one of the variables needed for gradient computation ...
Read more >
Network Screening with Crash Data – Frequency
Often agencies train road maintenance staff on how to identify safety issues. This training, combined with their knowledge of the roadway network, qualifies ...
Read more >
Preventing work-related motor vehicle crashes - CDC
We provide training specific to the vehicle(s) that the worker is expected to operate. We provide periodic “refresher” driver training. We have ...
Read more >
