CUDA error: device-side assert triggered while `RandomLinkSplit()`ing data
Describe the bug
I'm encountering a CUDA error with RandomLinkSplit() on PubMed and some non-Planetoid datasets such as Amazon-Photo and WikiCS:
import torch_geometric.transforms as T
dataset = get_dataset(path, args.dataset) # returns `torch_geometric.datasets.SomeDataset()`
data = dataset[0]
...
# splitting data
split = T.RandomLinkSplit(num_val=0., num_test=0., is_undirected=True, add_negative_train_samples=False)
train_data = split(data)[0] # line 122
When I run my training code on CUDA, assertion errors like "index out of bounds" occur:
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [465,0,0], thread: [36,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [465,0,0], thread: [37,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [465,0,0], thread: [38,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [465,0,0], thread: [39,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [465,0,0], thread: [40,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [465,0,0], thread: [41,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [465,0,0], thread: [42,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [16,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [17,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [18,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [64,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [65,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [66,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [67,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [68,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [69,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [70,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [71,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [72,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [73,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [74,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [464,0,0], thread: [57,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [464,0,0], thread: [58,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [464,0,0], thread: [59,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [464,0,0], thread: [103,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [464,0,0], thread: [104,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [464,0,0], thread: [105,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [464,0,0], thread: [106,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [464,0,0], thread: [107,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [464,0,0], thread: [108,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [464,0,0], thread: [109,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [464,0,0], thread: [110,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [464,0,0], thread: [111,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [464,0,0], thread: [112,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [464,0,0], thread: [113,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [464,0,0], thread: [114,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [464,0,0], thread: [115,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [62,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [63,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [123,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [124,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [125,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [126,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [232,0,0], thread: [127,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
Traceback (most recent call last):
File ".../train.py", line 318, in <module>
res = test(epoch)
File ".../train.py", line 122, in test
train_data = split(data)[0]
File "/home/dhc/anaconda3/lib/python3.8/site-packages/torch_geometric/transforms/random_link_split.py", line 116, in __call__
add_self_loops(data.edge_index)[0], num_nodes=data.num_nodes,
File "/home/dhc/anaconda3/lib/python3.8/site-packages/torch_geometric/utils/loop.py", line 82, in add_self_loops
N = maybe_num_nodes(edge_index, num_nodes)
File "/home/dhc/anaconda3/lib/python3.8/site-packages/torch_geometric/utils/num_nodes.py", line 25, in maybe_num_nodes
return int(edge_index.max()) + 1 if edge_index.numel() > 0 else 0
RuntimeError: CUDA error: device-side assert triggered
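A note on reading this traceback: CUDA kernels are launched asynchronously, so a device-side assert is usually reported at a later synchronization point rather than at the call that triggered it. Here `int(edge_index.max())` forces such a synchronization, so the indexing operation that actually failed may be an earlier one. A minimal sketch of how to make the traceback point at the failing call, assuming the environment variable is set before any CUDA work starts (the helper name is hypothetical):

```python
import os


def enable_blocking_cuda_launches() -> None:
    """Ask CUDA to run kernels synchronously so errors surface at the failing call.

    Call this before the first CUDA operation (safest: before importing torch).
    """
    os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


enable_blocking_cuda_launches()
# import torch  # import torch only after the variable has been set
```

The same effect can be had by launching the script with `CUDA_LAUNCH_BLOCKING=1` set in the shell. Synchronous launches are slower, so this is a debugging aid rather than something to leave enabled.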
This is odd to me, because no errors are raised and the code runs successfully when I set device='cpu', or when I use Cora / Citeseer on CUDA.
I've pip-upgraded PyG to 2.1.0.post1 and the issue still exists, but this time it is raised in negative_sampling.py:
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [100,0,0], thread: [0,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [100,0,0], thread: [1,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [100,0,0], thread: [2,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [100,0,0], thread: [3,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
......
/opt/conda/conda-bld/pytorch_1616554793803/work/aten/src/ATen/native/cuda/IndexKernel.cu:142: operator(): block: [44,0,0], thread: [63,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
Traceback (most recent call last):
File "/home/dhc/new_ch/GCA/train.py", line 318, in <module>
res = test(epoch)
File "/home/dhc/new_ch/GCA/train.py", line 122, in test
train_data = split(data)[0]
File "/home/dhc/anaconda3/lib/python3.8/site-packages/torch_geometric/transforms/random_link_split.py", line 207, in __call__
neg_edge_index = negative_sampling(edge_index, size,
File "/home/dhc/anaconda3/lib/python3.8/site-packages/torch_geometric/utils/negative_sampling.py", line 50, in negative_sampling
idx, population = edge_index_to_vector(edge_index, size, bipartite,
File "/home/dhc/anaconda3/lib/python3.8/site-packages/torch_geometric/utils/negative_sampling.py", line 277, in edge_index_to_vector
row, col = row[mask], col[mask]
RuntimeError: CUDA error: device-side assert triggered
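Since the assertion is an "index out of bounds" check, one inexpensive diagnostic is to confirm that every node id in the edge index lies in `[0, num_nodes)`. The following is only a sketch under that assumption: the function name is hypothetical, it is written in plain Python so it can run without a GPU, and a clean result would not by itself explain the error, which may also depend on the environment.

```python
def find_out_of_range_edges(edge_index, num_nodes):
    """Return (column, source, target) for every edge with an endpoint outside [0, num_nodes)."""
    # Accept a tensor-like object (copied to the CPU first) or a plain nested list.
    if hasattr(edge_index, "cpu"):
        edge_index = edge_index.cpu().tolist()
    sources, targets = edge_index
    bad = []
    for col, (src, dst) in enumerate(zip(sources, targets)):
        if not (0 <= src < num_nodes and 0 <= dst < num_nodes):
            bad.append((col, src, dst))
    return bad


# Toy graph with 3 nodes; the last edge points at a node id that does not exist.
toy_edge_index = [[0, 1, 2], [1, 2, 5]]
bad_edges = find_out_of_range_edges(toy_edge_index, num_nodes=3)
print(bad_edges)  # [(2, 2, 5)]
```

With the data from the snippet at the top of this report, the call would be `find_out_of_range_edges(data.edge_index, data.num_nodes)`, run before `split(data)`; an empty list means the edge index itself is consistent with the node count.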
I'd appreciate a solution that works with PyG==2.0.1. Huge thanks to anyone helping ❄️
Environment
- PyG version: 2.0.1, 2.1.0.post1 (I've tried both versions)
- PyTorch version: 1.8.1
- OS: Ubuntu 16.04.7 LTS
- Python version: 3.8
- CUDA/cuDNN version: 11.1
- How you installed PyTorch and PyG (conda, pip, source): pip
Top GitHub Comments
Thank god, the errors disappeared on my Windows PC (CUDA 11.7) with the newest versions of PyTorch and PyG. It turns out the problem is somehow specific to the GPU or OS I previously used. I'm still wondering why these errors occurred, but what more can I do, lol. My sincere gratitude to @EdisonLeeeee, @wwymak and the creator @rusty1s for answering this issue ❤️
Even `print(torch.randperm(1000000, device=torch.device('cuda:3')))`? That's weird. Perhaps you need to try it on other GPUs or another OS.