
Tensors left behind on CPU in DataParallel Implementation

See original GitHub issue

I am encountering an issue with tensors being left behind on the CPU, which triggers the following assertion error: AssertionError: Gather function not implemented for CPU tensors. This happens inside the process_sample function.

I can confirm that this is strictly associated with DataParallel, since passing only a single CUDA device suppresses the error.
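
For reference, here is a minimal sketch (not the actual project code) of how this error typically arises: nn.DataParallel scatters the input batch across GPUs, runs the replicas, and then gathers the outputs, and that gather step asserts if any returned tensor is still on the CPU.

    import torch
    import torch.nn as nn

    class Toy(nn.Module):
        def forward(self, x):
            # Hypothetical stand-in for process_sample: a tensor created
            # without an explicit device defaults to the CPU, even though
            # the input x was scattered onto a CUDA device.
            leftover = torch.empty(0)
            return x.sum(dim=1), leftover

    model = nn.DataParallel(Toy(), device_ids=[0, 1]).cuda()
    out = model(torch.randn(8, 4).cuda())
    # -> AssertionError: Gather function not implemented for CPU tensors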

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5

Top GitHub Comments

1 reaction
anmatako commented, Dec 30, 2021

@yyeboah I created PR #62 to address the empty tensors that can end up being created on the CPU, which could be the cause of the issue you’re seeing. If you can grab the changes from that PR and test them out, that would be great, as I don’t have a multi-GPU setup readily available to test this; it works, of course, on my single-GPU machine.
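
The exact diff in PR #62 is not reproduced here, but the kind of fix described, creating empty tensors on the same device (and with the same dtype) as the data they accompany rather than on the CPU default, looks roughly like this hypothetical helper:

    import torch

    def empty_on_same_device(reference: torch.Tensor) -> torch.Tensor:
        # Hypothetical helper: allocate the empty tensor on the reference
        # tensor's device, so DataParallel's gather step never encounters
        # a CPU tensor among the replica outputs.
        return torch.empty(0, dtype=reference.dtype, device=reference.device)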

0 reactions
yyeboah commented, Dec 31, 2021

@anmatako I have retrieved PR #62 and, after testing, I can confirm that the issue has indeed been resolved. Cheers, and have a Happy New Year!

Read more comments on GitHub >

Top Results From Across the Web

DataParallel does not work with tensors of dimension 0 #9811
Issue description. I have a network that returns a single value, which is a dimensionless tensor as of PyTorch 0.4.0
Read more >
Optional: Data Parallelism — PyTorch Tutorials 1.13.1+cu117 ...
In this tutorial, we will learn how to use multiple GPUs using DataParallel. It's very easy to use GPUs with PyTorch. You...
Read more >
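
For context, the usage pattern that tutorial covers is roughly the following (minimal sketch, assuming at least one CUDA device):

    import torch
    import torch.nn as nn

    model = nn.Linear(10, 2)
    if torch.cuda.device_count() > 1:
        # DataParallel replicates the module across the visible GPUs and
        # splits each input batch along dim 0, gathering the outputs back
        # onto the primary device afterwards.
        model = nn.DataParallel(model)
    model = model.cuda()
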
Distributed data parallel training in Pytorch
The easiest way to speed up neural network training is to use a GPU, which provides large speedups over CPUs on the types...
Read more >
PyTorch 101, Part 4: Memory Management and Using Multiple ...
Moving tensors around CPU / GPUs ... Every Tensor in PyTorch has a to() member function. Its job is to put the tensor...
Read more >
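
As a quick illustration of the to() call mentioned in that snippet (minimal sketch):

    import torch

    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    t = torch.zeros(3)   # allocated on the CPU by default
    t = t.to(device)     # to() returns a copy placed on the target device
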
PyTorch Distributed: Experiences on Accelerating Data ...
... design, implementation, and evaluation of the PyTorch distributed data parallel module. ... CPU input tensors to eliminate the overhead of copying...
Read more >
