
Training crashes due to inplace operation

See original GitHub issue

Hi there, I am trying to train this model on my custom dataset. Training starts, but after many iterations it crashes with this error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [2, 128, 64]], which is output 0 of ViewBackward, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

After enabling anomaly detection, here is the error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [2, 128, 64]], which is output 0 of ViewBackward, is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!

The traceback is not very helpful, as it only points at the loss.backward() line and the torch.autograd module. Can you help me figure out where the issue might be?

Thanks
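
For readers landing here with the same error: the hint in the message refers to a one-line switch, torch.autograd.set_detect_anomaly(True), which makes autograd record a forward-pass traceback for every op, so the backward error points at the line that produced the modified tensor instead of at loss.backward(). A minimal sketch that reproduces the same class of version-mismatch error (the tensor shape mirrors the message above; it is not the issue's actual model):

    import torch

    # Record a forward traceback for every op, so the backward error
    # identifies the operation whose saved tensor was overwritten.
    torch.autograd.set_detect_anomaly(True)

    x = torch.randn(2, 128, 64, requires_grad=True)
    y = (x * 2).view(2, 128, 64)  # y is output 0 of a ViewBackward node
    z = y.sin()                   # sin saves y: its backward needs cos(y)
    y.add_(1)                     # in-place edit bumps y's version counter to 1
    z.sum().backward()            # RuntimeError: ... at version 1; expected version 0

With anomaly mode enabled, the printed traceback points at the y.sin() line, i.e. the op whose saved input was clobbered.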

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
lucidrains commented, May 19, 2020

Woohoo!

1 reaction
NaxAlpha commented, May 19, 2020

Thanks a lot. It’s working now!!

Read more comments on GitHub >

Top Results From Across the Web

dropout(inplace=True) raises error: one of the variables ...
Describe the bug: invoking nn.Dropout(..., inplace=True) will make the training crash with "one of the variables needed for gradient computation has been ...
Read more >
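
The failure mode in this first result is easy to reproduce. A hedged sketch (the layer choice is illustrative, not taken from the linked report): inplace dropout overwrites a tensor that an earlier op saved for its backward pass.

    import torch
    import torch.nn as nn

    x = torch.randn(4, 8, requires_grad=True)
    h = x.sigmoid()                           # sigmoid saves its output for backward
    out = nn.Dropout(p=0.5, inplace=True)(h)  # overwrites h in place
    out.sum().backward()                      # RuntimeError: modified by an inplace operation

Switching to inplace=False (the default) makes dropout allocate a fresh tensor, leaving the saved one intact.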
RuntimeError: one of the variables needed for gradient ...
In my case, the issue was caused by the += notation. ... counts as an inplace operation, but normal variable assignment does not...
Read more >
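
As this answer notes, += desugars to an in-place add_, while plain assignment creates a new tensor. A minimal sketch of the difference (not from the issue's code):

    import torch

    x = torch.randn(3, requires_grad=True)

    h = x.exp()           # exp saves its output h; backward computes grad * h
    h += 1                # in-place: bumps h's version, invalidating the saved tensor
    # h.sum().backward()  # would raise the version-mismatch RuntimeError

    h = x.exp()
    h = h + 1             # out-of-place add: new tensor, saved one stays valid
    h.sum().backward()    # works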
VAE-GAN: INPLACE operation error - vision - PyTorch Forums
During training Autoencoder (generator) model I always get the same error: "RuntimeError: one of the variables needed for gradient computation ...
Read more >
Network Screening with Crash Data – Frequency
Often agencies train road maintenance staff on how to identify safety issues. This training, combined with their knowledge of the roadway network, qualifies ...
Read more >
Preventing work-related motor vehicle crashes - CDC
We provide training specific to the vehicle(s) that the worker is expected to operate. We provide periodic “refresher” driver training. We have ...
Read more >
