
Data Loading/sending to GPU error?

See the original GitHub issue (#1967)

❓ Questions & Help

This doesn’t seem to be a bug in PyTorch Geometric itself (though I may be mistaken), but I am hitting it while using the PyTorch Geometric DataLoader. When I do

pbar = tqdm(train_loader)   # progress bar around the PyTorch Geometric DataLoader
for data in pbar:
    data = data.to(device)  # move the batch to the GPU

where data comes from a DataLoader (train_loader = DataLoader(train_dataset, batch_size=args.batch_size, shuffle=True, num_workers=args.num_workers)), I am getting the following error:

0%|          | 0/5081 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/n/app/python/3.7.4/lib/python3.7/multiprocessing/queues.py", line 236, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/n/app/python/3.7.4/lib/python3.7/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
  File "/home/vym1/nn2/lib/python3.7/site-packages/torch/multiprocessing/reductions.py", line 134, in reduce_tensor
    raise RuntimeError("Cowardly refusing to serialize non-leaf tensor which requires_grad, "
RuntimeError: Cowardly refusing to serialize non-leaf tensor which requires_grad, since autograd does not support crossing process boundaries.  If you just want to transfer the data, call detach() on the tensor before serializing (e.g., putting it on the queue).

I’ve never seen this before, and when running on a different dataset with a different model, it works fine. Do you know why this is occurring?
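
The error message points at a non-leaf tensor that requires gradients being sent across processes. Here is a minimal, hypothetical sketch of how a loader can hit this (it uses the plain torch.utils.data.DataLoader and an invented BadDataset, not the actual dataset or model from this issue): if the items a dataset hands out are non-leaf tensors that still require gradients, the worker processes cannot pickle them when sending batches back to the main process.

import torch
from torch.utils.data import Dataset, DataLoader

class BadDataset(Dataset):
    """Hypothetical dataset whose items are non-leaf tensors that require grad."""
    def __init__(self):
        base = torch.randn(10, 4, requires_grad=True)  # leaf tensor
        self.items = base * 2.0  # result of an op: non-leaf, but still requires grad

    def __len__(self):
        return self.items.size(0)

    def __getitem__(self, idx):
        return self.items[idx]

loader = DataLoader(BadDataset(), batch_size=2, num_workers=2)
for batch in loader:  # a worker tries to serialize a non-leaf tensor -> RuntimeError
    pass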

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

2 reactions
rusty1s commented, Jan 7, 2021

I’m really not sure, but I believe setting num_workers=0 should fix this.
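
A minimal sketch of that workaround, reusing the names from the snippet in the question (train_dataset and args are assumed to exist there):

train_loader = DataLoader(
    train_dataset,
    batch_size=args.batch_size,
    shuffle=True,
    num_workers=0,  # load batches in the main process; nothing crosses process boundaries
)

This gives up parallel data loading, so it is a workaround rather than a root-cause fix.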

1 reaction
rusty1s commented, Jan 10, 2021
“Cowardly refusing to serialize non-leaf tensor which requires_grad, since autograd does not support crossing process boundaries. If you just want to transfer the data, call detach() on the tensor before serializing (e.g., putting it on the queue).”

num_workers=0 uses the main process to load data, so no tensors have to cross process boundaries and the serialization error cannot occur.
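
The error message itself suggests the other option: detach the offending tensors before they leave the dataset, so worker processes only ever see leaf tensors. A sketch building on the hypothetical BadDataset above:

class FixedDataset(BadDataset):
    """Hypothetical fix: hand workers detached (leaf) tensors."""
    def __getitem__(self, idx):
        # detach() drops the autograd history, so the tensor can be pickled and
        # sent across process boundaries; call requires_grad_() later if the
        # gradients are actually needed on the training side.
        return super().__getitem__(idx).detach()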
