
Tensors left behind on CPU in DataParallel Implementation

See original GitHub issue

I am encountering an issue with tensors being left behind on the CPU, which triggers the following assertion error: AssertionError: Gather function not implemented for CPU tensors. This happens inside the process_sample function.

I can confirm that this is strictly associated with DataParallel, since passing only a single CUDA device suppresses the error.
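
For reference, here is a minimal sketch (not the actual project code) of how this error typically arises: nn.DataParallel scatters the input batch across GPUs, runs the replicas, and then gathers the outputs, and that gather step asserts if any returned tensor is still on the CPU.

    import torch
    import torch.nn as nn

    class Toy(nn.Module):
        def forward(self, x):
            # Hypothetical stand-in for process_sample: a tensor created
            # without an explicit device defaults to the CPU, even though
            # the input x was scattered onto a CUDA device.
            leftover = torch.empty(0)
            return x.sum(dim=1), leftover

    model = nn.DataParallel(Toy(), device_ids=[0, 1]).cuda()
    out = model(torch.randn(8, 4).cuda())
    # -> AssertionError: Gather function not implemented for CPU tensors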

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5

Top GitHub Comments

1 reaction
anmatako commented, Dec 30, 2021

@yyeboah I created PR #62 to address the empty tensors that can end up being created on the CPU, which could be the cause of the issue you’re seeing. If you can grab the changes from that PR and test them out, that would be great, as I don’t have a multi-GPU setup readily available to test this; it works, of course, on my single-GPU machine.
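
The exact diff in PR #62 is not reproduced here, but the kind of fix described, creating empty tensors on the same device (and with the same dtype) as the data they accompany rather than on the CPU default, looks roughly like this hypothetical helper:

    import torch

    def empty_on_same_device(reference: torch.Tensor) -> torch.Tensor:
        # Hypothetical helper: allocate the empty tensor on the reference
        # tensor's device, so DataParallel's gather step never encounters
        # a CPU tensor among the replica outputs.
        return torch.empty(0, dtype=reference.dtype, device=reference.device)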

0 reactions
yyeboah commented, Dec 31, 2021

@anmatako I have retrieved PR #62 and, after testing, I can confirm that the issue has indeed been resolved. Cheers, and have a Happy New Year!

Read more comments on GitHub >

Top Results From Across the Web

DataParallel does not work with tensors of dimension 0 #9811
Issue description. I have a network that returns a single value, which is a dimensionless tensor as of PyTorch 0.4.0
Read more >
Optional: Data Parallelism — PyTorch Tutorials 1.13.1+cu117 ...
In this tutorial, we will learn how to use multiple GPUs using DataParallel. It's very easy to use GPUs with PyTorch. You...
Read more >
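
For context, the usage pattern that tutorial covers is roughly the following (minimal sketch, assuming at least one CUDA device):

    import torch
    import torch.nn as nn

    model = nn.Linear(10, 2)
    if torch.cuda.device_count() > 1:
        # DataParallel replicates the module across the visible GPUs and
        # splits each input batch along dim 0, gathering the outputs back
        # onto the primary device afterwards.
        model = nn.DataParallel(model)
    model = model.cuda()
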
Distributed data parallel training in Pytorch
The easiest way to speed up neural network training is to use a GPU, which provides large speedups over CPUs on the types...
Read more >
PyTorch 101, Part 4: Memory Management and Using Multiple ...
Moving tensors around CPU / GPUs ... Every Tensor in PyTorch has a to() member function. Its job is to put the tensor...
Read more >
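
As a quick illustration of the to() call mentioned in that snippet (minimal sketch):

    import torch

    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    t = torch.zeros(3)   # allocated on the CPU by default
    t = t.to(device)     # to() returns a copy placed on the target device
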
PyTorch Distributed: Experiences on Accelerating Data ...
... design, implementation, and evaluation of the PyTorch distributed data parallel module. ... CPU input tensors to eliminate the overhead of copying...
Read more >
