RuntimeError: CUDA error: an illegal memory access was encountered
File "examples/sem_seg_sparse/train.py", line 142, in <module>
main()
File "examples/sem_seg_sparse/train.py", line 61, in main
train(model, train_loader, optimizer, scheduler, criterion, opt)
File "examples/sem_seg_sparse/train.py", line 79, in train
out = model(data)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/content/drive/My Drive/deep_gcns_torch/examples/sem_seg_sparse/architecture.py", line 69, in forward
feats.append(self.gunet(feats[-1],edge_index=edge_index ,batch=batch))
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/torch_geometric/nn/models/graph_unet.py", line 83, in forward
x.size(0))
File "/usr/local/lib/python3.6/dist-packages/torch_geometric/nn/models/graph_unet.py", line 120, in augment_adj
num_nodes)
File "/usr/local/lib/python3.6/dist-packages/torch_sparse/spspmm.py", line 30, in spspmm
C = matmul(A, B)
File "/usr/local/lib/python3.6/dist-packages/torch_sparse/matmul.py", line 107, in matmul
return spspmm(src, other, reduce)
File "/usr/local/lib/python3.6/dist-packages/torch_sparse/matmul.py", line 95, in spspmm
return spspmm_sum(src, other)
File "/usr/local/lib/python3.6/dist-packages/torch_sparse/matmul.py", line 83, in spspmm_sum
rowptrA, colA, valueA, rowptrB, colB, valueB, K)
RuntimeError: CUDA error: an illegal memory access was encountered (launch_kernel at /pytorch/aten/src/ATen/native/cuda/Loops.cuh:103)
Hi, I'm integrating the GraphU-Net and another model on Google Colab, but I'm running into this bug. Could you help me? Thanks.
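CUDA errors like this one are reported asynchronously, so the line blamed in the traceback is not necessarily the kernel that actually faulted. A minimal debugging sketch (generic PyTorch/CUDA practice, not code from this repository) for narrowing it down:

import os

# Make kernel launches synchronous so the Python traceback points at the
# launch that actually performed the illegal access. The variable has to be
# set before CUDA is initialized, i.e. before the first CUDA call.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch

# Alternatively, run the same step on CPU: an out-of-bounds index then
# surfaces as an ordinary Python error instead of a CUDA fault.
device = torch.device("cpu")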
Issue Analytics
- Created: 3 years ago
- Comments: 20 (8 by maintainers)
The error seems to stem from the fact that cuSPARSE cannot handle duplicated edges in edge_index: with duplicates it fails to compute the correct number of output edges. In your case, it may well be that your graph contains some initial self-loop edges, which should be removed before calling add_self_loops. I think your fix for augment_adj is correct, and I have added it to the GraphUNet model in PyG.
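For reference, the fix boils down to stripping pre-existing self-loops before the adjacency is augmented. A sketch of the patched augment_adj, written here as a free function for readability (the utility signatures are assumed from the PyG and torch_sparse versions in the traceback and may differ in newer releases):

from torch_geometric.utils import add_self_loops, remove_self_loops, sort_edge_index
from torch_sparse import spspmm

def augment_adj(edge_index, edge_weight, num_nodes):
    # Strip any self-loops already present in the input graph so that the
    # add_self_loops call below cannot create duplicated edges, which is
    # what cuSPARSE's spspmm trips over.
    edge_index, edge_weight = remove_self_loops(edge_index, edge_weight)
    edge_index, edge_weight = add_self_loops(edge_index, edge_weight,
                                             num_nodes=num_nodes)
    edge_index, edge_weight = sort_edge_index(edge_index, edge_weight,
                                              num_nodes)
    # Square the adjacency (A @ A) to add two-hop connectivity before pooling.
    edge_index, edge_weight = spspmm(edge_index, edge_weight, edge_index,
                                     edge_weight, num_nodes, num_nodes,
                                     num_nodes)
    edge_index, edge_weight = remove_self_loops(edge_index, edge_weight)
    return edge_index, edge_weight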
@vthost @rusty1s Hi, I also ran into this error when training Graph-UNet on my own dataset. It occurred randomly on GPU but never on CPU. I changed the augment_adj function to call remove_self_loops first, and the problem went away, but I don't know why.
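The reason the reordering matters: if the input graph already contains a self-loop (i, i), the add_self_loops call inside augment_adj appends a second (i, i) edge, and that duplicated edge is exactly what cuSPARSE's spspmm cannot handle. A toy illustration with a hypothetical edge_index (signatures assumed from recent PyG):

import torch
from torch_geometric.utils import add_self_loops, remove_self_loops

# Toy graph with a pre-existing self-loop on node 2.
edge_index = torch.tensor([[0, 1, 2],
                           [1, 0, 2]])

# Without cleaning: (2, 2) now appears twice, i.e. a duplicated edge.
looped, _ = add_self_loops(edge_index, num_nodes=3)
print(looped)
# tensor([[0, 1, 2, 0, 1, 2],
#         [1, 0, 2, 0, 1, 2]])

# With the fix: strip existing self-loops first, then add them back once.
cleaned, _ = remove_self_loops(edge_index)
fixed, _ = add_self_loops(cleaned, num_nodes=3)
print(fixed)
# tensor([[0, 1, 0, 1, 2],
#         [1, 0, 0, 1, 2]])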