device-side assert from cuda
When I tried the pointnet2_sem_seg model on another dataset, this error occurred:
device-side assert from cuda
From a Google search, this problem is probably caused by an out-of-bounds index. After I ran the model in CPU mode, the traceback clearly pointed to the line where the error occurs:
==========index_points=========
points.shape: torch.Size([8, 512, 3])
idx.shape: torch.Size([8, 1024])
view_shape: [8, 1]
repeat_shape: [1, 1024]
==========index_points=========
points.shape: torch.Size([8, 512, 3])
idx.shape: torch.Size([8, 1024, 32])
view_shape: [8, 1, 1]
repeat_shape: [1, 1024, 32]
Traceback (most recent call last):
File "train_semseg.py", line 277, in <module>
main(args)
File "train_semseg.py", line 180, in main
seg_pred, trans_feat = classifier(points)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/idriver/work/wt/nuRadarScenes/models/pointnet2_sem_seg.py", line 26, in forward
l1_xyz, l1_points = self.sa1(l0_xyz, l0_points)
File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/home/idriver/work/wt/nuRadarScenes/models/pointnet_util.py", line 202, in forward
new_xyz, new_points = sample_and_group(self.npoint, self.radius, self.nsample, xyz, points)
File "/home/idriver/work/wt/nuRadarScenes/models/pointnet_util.py", line 135, in sample_and_group
grouped_xyz = index_points(xyz, idx) # [B, npoint, nsample, C]
File "/home/idriver/work/wt/nuRadarScenes/models/pointnet_util.py", line 64, in index_points
new_points = points[batch_indices, idx, :]
RuntimeError: index 512 is out of bounds for dim with size 512
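The traceback above shows the failure only at the final indexing step. A minimal sketch of a bounds check that could run just before that step is given here; `checked_index_points` is a hypothetical helper, not part of the repository, and it assumes `points` is [B, N, C] and `idx` is [B, ...] as in the debug output. It also rejects negative indices, although PyTorch itself would accept them.

```python
import torch


def checked_index_points(points, idx):
    """Gather points[b, idx[b, ...], :] after validating idx (hypothetical helper)."""
    B, N = points.shape[0], points.shape[1]
    # int(...) copies the scalar to the host, so the check also works on CUDA tensors.
    lo, hi = int(idx.min()), int(idx.max())
    if lo < 0 or hi >= N:
        raise IndexError(f"idx range [{lo}, {hi}] is outside [0, {N - 1}]")
    # One batch index per entry of idx, broadcast to idx's shape.
    view_shape = [B] + [1] * (idx.dim() - 1)
    batch_indices = torch.arange(B, device=points.device).view(view_shape).expand_as(idx)
    return points[batch_indices, idx, :]


points = torch.arange(2 * 4 * 3, dtype=torch.float32).view(2, 4, 3)
good_idx = torch.tensor([[0, 3], [1, 2]])
gathered = checked_index_points(points, good_idx)

bad_idx = torch.tensor([[0, 4], [1, 2]])  # 4 is out of bounds for N = 4
try:
    checked_index_points(points, bad_idx)
    bad_idx_rejected = False
except IndexError:
    bad_idx_rejected = True
```

With a check like this, the same mistake raises a readable Python exception on the GPU as well, instead of a device-side assert.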
So idx is out of bounds in the function index_points.
idx is produced by the function below:
def query_ball_point(radius, nsample, xyz, new_xyz):
    """
    Input:
        radius: local region radius
        nsample: max sample number in local region
        xyz: all points, [B, N, 3]
        new_xyz: query points, [B, S, 3]
    Return:
        group_idx: grouped points index, [B, S, nsample]
    """
    device = xyz.device
    B, N, C = xyz.shape
    _, S, _ = new_xyz.shape
    group_idx = torch.arange(N, dtype=torch.long).to(device).view(1, 1, N).repeat([B, S, 1])
    sqrdists = square_distance(new_xyz, xyz)
    group_idx[sqrdists > radius ** 2] = N
    group_idx = group_idx.sort(dim=-1)[0][:, :, :nsample]
    group_first = group_idx[:, :, 0].view(B, S, 1).repeat([1, 1, nsample])
    mask = group_idx == N
    group_idx[mask] = group_first[mask]
    return group_idx
When I change these two lines to group_idx[sqrdists > radius ** 2] = N-1 and mask = group_idx == N-1, it works!
So is that the right fix?
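One way to examine the N-1 change is to run the same steps on a tiny input with the marker value as a parameter. The sketch below is an illustration only: `ball_query_with_sentinel` is a hypothetical name, and the exact pairwise distance stands in for the repository's `square_distance`, which is not shown here.

```python
import torch


def ball_query_with_sentinel(radius, nsample, xyz, new_xyz, sentinel):
    """Same steps as query_ball_point, with the out-of-range marker as a parameter."""
    B, N, _ = xyz.shape
    S = new_xyz.shape[1]
    # Exact squared distances [B, S, N]; an assumed stand-in for square_distance.
    sqrdists = ((new_xyz.unsqueeze(2) - xyz.unsqueeze(1)) ** 2).sum(dim=-1)
    group_idx = torch.arange(N, device=xyz.device).view(1, 1, N).repeat(B, S, 1)
    group_idx[sqrdists > radius ** 2] = sentinel
    group_idx = group_idx.sort(dim=-1)[0][:, :, :nsample]
    group_first = group_idx[:, :, 0].view(B, S, 1).repeat(1, 1, group_idx.shape[-1])
    mask = group_idx == sentinel
    group_idx[mask] = group_first[mask]
    return group_idx


# Three points on a line, so N = 3 and the last valid index is 2.
xyz = torch.tensor([[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 0.0, 0.0]]])

# Case 1: the query has no neighbour within the radius.
far_query = torch.tensor([[[10.0, 0.0, 0.0]]])
empty_with_n = ball_query_with_sentinel(1.5, 2, xyz, far_query, sentinel=3).tolist()
empty_with_n_minus_1 = ball_query_with_sentinel(1.5, 2, xyz, far_query, sentinel=2).tolist()

# Case 2: the query coincides with the last point, and point 1 is also in range.
last_query = torch.tensor([[[5.0, 0.0, 0.0]]])
last_with_n = ball_query_with_sentinel(4.5, 3, xyz, last_query, sentinel=3).tolist()
last_with_n_minus_1 = ball_query_with_sentinel(4.5, 3, xyz, last_query, sentinel=2).tolist()
```

In this toy input, the N-1 marker avoids the out-of-bounds index, but it has two side effects: an empty ball is filled with point N-1 even though that point lies outside the radius, and a point N-1 that really is in range is treated as a marker and replaced by the first neighbour. Whether that matters depends on the data.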
Issue Analytics
- State:
- Created 4 years ago
- Comments:5

I think this error occurs when no point lies within the ball for the radius you set. In that case, the group_first variable is also N, which is out of bounds.

Hi, you can try this file: https://github.com/yanx27/Pointnet_Pointnet2_pytorch/blob/master/models/pointnet2_utils.py
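If the empty-ball explanation above is right, one possible guard is to fall back to the nearest point whenever a ball contains nothing. The following is a sketch under that assumption, not the code in the linked file: `query_ball_point_with_fallback` is a hypothetical name, and the exact pairwise distance again stands in for `square_distance`.

```python
import torch


def query_ball_point_with_fallback(radius, nsample, xyz, new_xyz):
    """Ball query that pads an empty ball with the nearest point (hypothetical variant)."""
    B, N, _ = xyz.shape
    S = new_xyz.shape[1]
    # Exact squared distances [B, S, N]; an assumed stand-in for square_distance.
    sqrdists = ((new_xyz.unsqueeze(2) - xyz.unsqueeze(1)) ** 2).sum(dim=-1)
    group_idx = torch.arange(N, device=xyz.device).view(1, 1, N).repeat(B, S, 1)
    group_idx[sqrdists > radius ** 2] = N  # N marks "outside the ball"
    group_idx = group_idx.sort(dim=-1)[0][:, :, :nsample]
    # If even the first slot holds the marker, the ball is empty: use the nearest point.
    nearest = sqrdists.argmin(dim=-1, keepdim=True)  # [B, S, 1]
    group_first = group_idx[:, :, :1]
    group_first = torch.where(group_first == N, nearest, group_first)
    group_first = group_first.expand(-1, -1, group_idx.shape[-1])
    mask = group_idx == N
    group_idx[mask] = group_first[mask]
    return group_idx


xyz = torch.tensor([[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 0.0, 0.0]]])
# First query has two neighbours in range; the second has none.
new_xyz = torch.tensor([[[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]]])
fallback_idx = query_ball_point_with_fallback(1.5, 2, xyz, new_xyz)
```

Every returned index stays below N, so the later indexing step cannot go out of bounds. The nearest point may still lie outside the radius, which is a design choice to weigh against skipping or masking such groups.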