all-zero gt_boxes are loaded and cause runtime error
Hi, team,
I noticed that the following code was added to dataset.py to avoid loading a frame without gt_boxes.
    if len(data_dict['gt_boxes']) == 0:
        new_index = np.random.randint(self.__len__())
        return self.__getitem__(new_index)
However, the gt_boxes in batch_dict for certain frames are still sometimes all zeros before the model processes them, and some downstream operations (such as torch.max()) then fail.
To check this, I added the following code at the beginning of forward() in point_rcnn.py:
    def forward(self, batch_dict):
        for gt_box in batch_dict['gt_boxes']:
            if gt_box.max() == gt_box.min() == 0:
                pdb.set_trace()
        ...
Every time it stopped here, I printed batch_dict['gt_boxes'] and found one frame whose gt_boxes were all zeros, as shown below.
(Pdb) gt_boxes = batch_dict['gt_boxes']
(Pdb) gt_boxes[0]
tensor([[0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.]], device='cuda:0')
PointRCNN uses PointResidualCoder, so if gt_boxes is all zeros, no foreground gt_boxes are selected and the input to PointResidualCoder is empty. The following assert in encode_torch() then raises a RuntimeError:
assert gt_classes.max() <= self.mean_size.shape[0]
RuntimeError: invalid argument 1: cannot perform reduction function max on tensor with no elements because the operation does not have an identity at /opt/conda/conda-bld/pytorch_1587428270644/work/aten/src/THC/generic/THCTensorMathReduce.cu:85
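The error above comes from taking a maximum over an input with no elements. As a minimal plain-Python sketch of the same situation (lists stand in for tensors, and `check_classes` is a hypothetical helper, not the code in encode_torch()), one way to guard such a check is to return early on empty input:

```python
def check_classes(gt_classes, num_classes):
    """Validate class ids, tolerating an empty input.

    Hypothetical helper for illustration; lists stand in for tensors.
    """
    # max() over an empty sequence raises, so return early
    # when there is nothing to check.
    if len(gt_classes) == 0:
        return True
    return max(gt_classes) <= num_classes


# An unguarded reduction over an empty list fails,
# much like the empty-tensor case in the traceback.
try:
    max([])
    unguarded_failed = False
except ValueError:
    unguarded_failed = True
```

Whether an empty input should count as valid is a design choice; here it is assumed that a frame with no foreground boxes should simply be skipped by the check.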
I trained the model with CLASS_NAMES: ['Cyclist'].
Is this expected, and what could be the cause?
Issue Analytics
- State:
- Created 3 years ago
- Comments:11 (1 by maintainers)

I reviewed the code and found that the problem comes from DataBaseSampler. Its __call__() (database_sampler.py) computes the IoUs of the sampled boxes and keeps only the boxes that do not overlap with others, so valid_mask (line 188) can be empty.
So when I choose only one class for training, such as Cyclist, even if the constraint in dataset.py (line 127) is satisfied, all the remaining boxes could be anything but Cyclist. When execution reaches the code at line 132 (dataset.py), selected will be empty and, in the end, no gt_boxes can be selected.
So I changed the code at line 127 (dataset.py), and it seems to work.
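The situation described in this comment can be sketched in plain Python. The frame contents and the `select_by_class` helper below are hypothetical and only illustrate how a non-empty check can pass before class filtering while nothing is left afterwards:

```python
def select_by_class(gt_names, class_names):
    """Return indices of boxes whose name is one of the trained classes."""
    return [i for i, name in enumerate(gt_names) if name in class_names]


# Hypothetical frame: boxes exist, but none belongs to the trained class.
gt_names = ["Car", "Pedestrian", "Car"]
class_names = ["Cyclist"]

# A length check made at this point passes ...
has_boxes_before_filter = len(gt_names) > 0

# ... but the class filter then removes every box.
selected = select_by_class(gt_names, class_names)
has_boxes_after_filter = len(selected) > 0
```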
Hi all,
This bug has been fixed in https://github.com/open-mmlab/OpenPCDet/pull/340.
Actually, I moved the empty check to the end of the prepare_data function, since data_processor could also modify the gt_boxes.
The error in PointResidualCoder is another bug, and I also fixed it in this PR. Note that neither of these two errors affects the performance.
Thank you all for the bug information.
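As a rough illustration of running the empty check only after every step that can drop boxes, here is a self-contained sketch. `ToyDataset` and its frame format are hypothetical, and it resamples in a loop rather than through the recursive __getitem__ call quoted in the issue; it is not the code from the PR.

```python
import random


class ToyDataset:
    """Hypothetical dataset; each frame is a list of (class_name, box) pairs."""

    def __init__(self, frames, class_names, seed=0):
        self.frames = frames
        self.class_names = class_names
        self.rng = random.Random(seed)

    def __len__(self):
        return len(self.frames)

    def prepare_data(self, frame):
        # Stand-in for augmentation, class filtering and data processing:
        # any of these steps may drop boxes.
        gt_boxes = [box for name, box in frame if name in self.class_names]
        return {"gt_boxes": gt_boxes}

    def __getitem__(self, index):
        data_dict = self.prepare_data(self.frames[index])
        # Check only after all box-removing steps have run, then resample.
        # Assumes at least one frame contains a box of a trained class;
        # otherwise this loop would never terminate.
        while len(data_dict["gt_boxes"]) == 0:
            index = self.rng.randrange(len(self))
            data_dict = self.prepare_data(self.frames[index])
        return data_dict


frames = [[("Car", (0, 0, 0))], [("Cyclist", (1, 2, 3))]]
dataset = ToyDataset(frames, ["Cyclist"])
sample = dataset[0]  # frame 0 has no Cyclist, so another frame is drawn
```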