device-side assert from cuda

When I tried the pointnet2_sem_seg model on another dataset, this error occurred: device-side assert from CUDA. From what I found on Google, this is probably caused by an index going out of bounds. After I ran the model in CPU mode, the traceback clearly pointed to the line where the error occurs:

==========index_points=========
points.shape: torch.Size([8, 512, 3])
idx.shape: torch.Size([8, 1024])
view_shape: [8, 1]
repeat_shape: [1, 1024]
==========index_points=========
points.shape: torch.Size([8, 512, 3])
idx.shape: torch.Size([8, 1024, 32])
view_shape: [8, 1, 1]
repeat_shape: [1, 1024, 32]
Traceback (most recent call last):
  File "train_semseg.py", line 277, in <module>
    main(args)
  File "train_semseg.py", line 180, in main
    seg_pred, trans_feat = classifier(points)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/idriver/work/wt/nuRadarScenes/models/pointnet2_sem_seg.py", line 26, in forward
    l1_xyz, l1_points = self.sa1(l0_xyz, l0_points)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/idriver/work/wt/nuRadarScenes/models/pointnet_util.py", line 202, in forward
    new_xyz, new_points = sample_and_group(self.npoint, self.radius, self.nsample, xyz, points)
  File "/home/idriver/work/wt/nuRadarScenes/models/pointnet_util.py", line 135, in sample_and_group
    grouped_xyz = index_points(xyz, idx) # [B, npoint, nsample, C]
  File "/home/idriver/work/wt/nuRadarScenes/models/pointnet_util.py", line 64, in index_points
    new_points = points[batch_indices, idx, :]
RuntimeError: index 512 is out of bounds for dim with size 512

So idx is out of bounds in the function index_points, and idx is produced by the function below:

def query_ball_point(radius, nsample, xyz, new_xyz):
    """
    Input:
        radius: local region radius
        nsample: max sample number in local region
        xyz: all points, [B, N, 3]
        new_xyz: query points, [B, S, 3]
    Return:
        group_idx: grouped points index, [B, S, nsample]
    """
    device = xyz.device
    B, N, C = xyz.shape
    _, S, _ = new_xyz.shape
    group_idx = torch.arange(N, dtype=torch.long).to(device).view(1, 1, N).repeat([B, S, 1])
    sqrdists = square_distance(new_xyz, xyz)
    group_idx[sqrdists > radius ** 2] = N
    group_idx = group_idx.sort(dim=-1)[0][:, :, :nsample]
    group_first = group_idx[:, :, 0].view(B, S, 1).repeat([1, 1, nsample])
    mask = group_idx == N
    group_idx[mask] = group_first[mask]
    return group_idx

When I change those two lines to group_idx[sqrdists > radius ** 2] = N-1 and mask = group_idx == N-1, it works!

So is that right?
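
To see what the N-1 change does in practice, here is a minimal sketch that runs the same steps with the marker value passed in as a parameter. The function name query_ball_point_with_sentinel and the toy data are made up for illustration, and torch.cdist is used as a stand-in for the repository's square_distance helper, which is not shown in this thread.

```python
import torch

def query_ball_point_with_sentinel(radius, nsample, xyz, new_xyz, sentinel):
    """Same steps as query_ball_point, with the out-of-ball marker as a parameter."""
    B, N, _ = xyz.shape
    S = new_xyz.shape[1]
    group_idx = torch.arange(N, dtype=torch.long, device=xyz.device).view(1, 1, N).repeat(B, S, 1)
    # Assumption: torch.cdist stands in for the repository's square_distance helper.
    sqrdists = torch.cdist(new_xyz, xyz) ** 2
    # Mark every point outside the ball with the sentinel value.
    group_idx[sqrdists > radius ** 2] = sentinel
    group_idx = group_idx.sort(dim=-1)[0][:, :, :nsample]
    # Pad the remaining sentinel slots with the first index of each group.
    group_first = group_idx[:, :, 0].view(B, S, 1).repeat(1, 1, nsample)
    mask = group_idx == sentinel
    group_idx[mask] = group_first[mask]
    return group_idx

# Made-up data: four points on the x-axis, so N = 4.
xyz = torch.tensor([[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [3.0, 0.0, 0.0]]])
# Query 0 sits between points 2 and 3; query 1 is far away from every point.
new_xyz = torch.tensor([[[2.5, 0.0, 0.0], [100.0, 0.0, 0.0]]])
N = xyz.shape[1]

idx_n = query_ball_point_with_sentinel(0.6, 2, xyz, new_xyz, sentinel=N)
idx_n_minus_1 = query_ball_point_with_sentinel(0.6, 2, xyz, new_xyz, sentinel=N - 1)
print(idx_n.tolist())          # [[[2, 3], [4, 4]]]
print(idx_n_minus_1.tolist())  # [[[2, 2], [3, 3]]]
```

With the marker N, the far-away query returns index 4, which is out of bounds for 4 points. With the marker N-1, every index is valid, but two things change silently: the far-away query is given point 3 even though it lies outside the radius, and for query 0 the real point 3 is treated as a marker and replaced by point 2. So the change avoids the crash, but it does not behave the same as the original function in the cases that did not crash.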

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5

Top GitHub Comments

1 reaction
Connor323 commented, Jul 21, 2020

I think this error occurs when no point lies within the ball for the radius you set. In that case, the group_first variable is also N, which is out of bounds.
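
One possible way to handle the empty-ball case described above is to keep N as the marker and fall back to the nearest point when a ball contains nothing. This is only a sketch under assumptions, not the repository's fix: the name query_ball_point_safe is hypothetical, torch.cdist stands in for square_distance, and the nearest-neighbour fallback is a design choice that may not suit every dataset.

```python
import torch

def query_ball_point_safe(radius, nsample, xyz, new_xyz):
    """Ball query that never returns the out-of-bounds marker N (hypothetical variant)."""
    B, N, _ = xyz.shape
    S = new_xyz.shape[1]
    # Assumption: torch.cdist stands in for the repository's square_distance helper.
    sqrdists = torch.cdist(new_xyz, xyz) ** 2                  # [B, S, N]
    group_idx = torch.arange(N, dtype=torch.long, device=xyz.device).view(1, 1, N).repeat(B, S, 1)
    group_idx[sqrdists > radius ** 2] = N                      # N marks "outside the ball"
    group_idx = group_idx.sort(dim=-1)[0][:, :, :nsample]
    # If even the first slot is N, the ball is empty: use the nearest point instead.
    nearest = sqrdists.argmin(dim=-1, keepdim=True)            # [B, S, 1]
    group_first = group_idx[:, :, :1]
    group_first = torch.where(group_first == N, nearest, group_first)
    group_first = group_first.expand(-1, -1, group_idx.shape[-1])
    # Pad the remaining marker slots as in the original function.
    mask = group_idx == N
    group_idx[mask] = group_first[mask]
    return group_idx

# Made-up data: four points on the x-axis, one normal query and one with an empty ball.
xyz = torch.tensor([[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [3.0, 0.0, 0.0]]])
new_xyz = torch.tensor([[[2.5, 0.0, 0.0], [100.0, 0.0, 0.0]]])
safe_idx = query_ball_point_safe(0.6, 2, xyz, new_xyz)
print(safe_idx.tolist())  # [[[2, 3], [3, 3]]]
```

Here the normal query keeps both of its in-ball points, and the empty ball is filled with its nearest point (index 3) instead of the invalid index 4. If an empty ball signals bad input for your data (for example, a radius that is far too small), raising an error at that spot may be a better choice than falling back.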

0 reactions
liguanglin commented, May 9, 2022

This issue still happens in the latest version. I think it should be reopened and handled properly rather than just ignored.

Hi, you can try this file: https://github.com/yanx27/Pointnet_Pointnet2_pytorch/blob/master/models/pointnet2_utils.py
