Problems using custom data sets
I am trying to test PV-RCNN on my own lidar data instead of the KITTI dataset, using annotations similar to the Kaggle format. However, running the code fails with the following error:
File "***/OpenPCDet/pcdet/datasets/innovusion/innovusion_dataset.py", line 77, in __getitem__
data_dict = self.prepare_data(data_dict=input_dict)
File "***/OpenPCDet/pcdet/datasets/dataset.py", line 124, in prepare_data
'gt_boxes_mask': gt_boxes_mask
File "***/OpenPCDet/pcdet/datasets/augmentor/data_augmentor.py", line 93, in forward
data_dict = cur_augmentor(data_dict=data_dict)
File "***/OpenPCDet/pcdet/datasets/augmentor/database_sampler.py", line 179, in __call__
sampled_boxes = np.stack([x['box3d_lidar'] for x in sampled_dict], axis=0).astype(np.float32)
File "<__array_function__ internals>", line 6, in stack
File "***/anaconda3/envs/ml/lib/python3.7/site-packages/numpy/core/shape_base.py", line 423, in stack
raise ValueError('need at least one array to stack')
ValueError: need at least one array to stack
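The error itself comes from NumPy rather than OpenPCDet: np.stack raises exactly this ValueError whenever it is handed an empty list, which is what happens when the sampler returns no boxes. A minimal reproduction:

```python
import numpy as np

# np.stack requires at least one input array; an empty list
# reproduces the ValueError seen in the traceback above.
try:
    np.stack([], axis=0)
except ValueError as e:
    print(e)  # need at least one array to stack
```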
I traced the error to the data augmentation code in pcdet/datasets/augmentor/database_sampler.py:
def __call__(self, data_dict):
    """
    Args:
        data_dict:
            gt_boxes: (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]

    Returns:
    """
    gt_boxes = data_dict['gt_boxes']
    gt_names = data_dict['gt_names'].astype(str)
    existed_boxes = gt_boxes
    total_valid_sampled_dict = []
    for class_name, sample_group in self.sample_groups.items():
        if self.limit_whole_scene:
            num_gt = np.sum(class_name == gt_names)
            sample_group['sample_num'] = str(int(self.sample_class_num[class_name]) - num_gt)
        if int(sample_group['sample_num']) > 0:
            sampled_dict = self.sample_with_fixed_number(class_name, sample_group)  # need help
            sampled_boxes = np.stack([x['box3d_lidar'] for x in sampled_dict], axis=0).astype(np.float32)

            if self.sampler_cfg.get('DATABASE_WITH_FAKELIDAR', False):
                sampled_boxes = box_utils.boxes3d_kitti_fakelidar_to_lidar(sampled_boxes)

            iou1 = iou3d_nms_utils.boxes_bev_iou_cpu(sampled_boxes[:, 0:7], existed_boxes[:, 0:7])
            iou2 = iou3d_nms_utils.boxes_bev_iou_cpu(sampled_boxes[:, 0:7], sampled_boxes[:, 0:7])
            iou2[range(sampled_boxes.shape[0]), range(sampled_boxes.shape[0])] = 0
            iou1 = iou1 if iou1.shape[1] > 0 else iou2
            valid_mask = ((iou1.max(axis=1) + iou2.max(axis=1)) == 0).nonzero()[0]
            valid_sampled_dict = [sampled_dict[x] for x in valid_mask]
            valid_sampled_boxes = sampled_boxes[valid_mask]

            existed_boxes = np.concatenate((existed_boxes, valid_sampled_boxes), axis=0)
            total_valid_sampled_dict.extend(valid_sampled_dict)

    sampled_gt_boxes = existed_boxes[gt_boxes.shape[0]:, :]
    if total_valid_sampled_dict.__len__() > 0:
        data_dict = self.add_sampled_boxes_to_scene(data_dict, sampled_gt_boxes, total_valid_sampled_dict)

    data_dict.pop('gt_boxes_mask')
    return data_dict
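One way to confirm the diagnosis (a defensive sketch, not the upstream fix) is to guard the failing np.stack call so an empty sample group is skipped instead of crashing; stack_sampled_boxes here is a hypothetical wrapper around that one line:

```python
import numpy as np

def stack_sampled_boxes(sampled_dict):
    """Stack the 'box3d_lidar' entries of sampled_dict, or return None.

    sampled_dict is the list returned by sample_with_fixed_number; when
    db_infos has no entries for the class, the list is empty and the
    original code crashes with "need at least one array to stack".
    """
    if len(sampled_dict) == 0:
        # Nothing sampled for this class -> caller should skip the group.
        return None
    return np.stack([x['box3d_lidar'] for x in sampled_dict], axis=0).astype(np.float32)
```

For example, `stack_sampled_boxes([])` returns None, while a one-element list yields a (1, 7) float32 array.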
The key function is sample_with_fixed_number(self, class_name, sample_group):
def sample_with_fixed_number(self, class_name, sample_group):
    """
    Args:
        class_name:
        sample_group:

    Returns:
    """
    sample_num, pointer, indices = int(sample_group['sample_num']), sample_group['pointer'], sample_group['indices']
    if pointer >= len(self.db_infos[class_name]):
        indices = np.random.permutation(len(self.db_infos[class_name]))
        pointer = 0

    sampled_dict = [self.db_infos[class_name][idx] for idx in indices[pointer: pointer + sample_num]]
    pointer += sample_num
    sample_group['pointer'] = pointer
    sample_group['indices'] = indices
    return sampled_dict
The code uses self.db_infos, which is loaded from the file given by sampler_cfg.DB_INFO_PATH, but my dataset does not have this file, so I am stuck here. What do I need to do to fix this? Or is there a detailed explanation that would help me understand this code?

Note: my data annotation format is
id confidence center_x center_y center_z width length height yaw class_name
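For reference, OpenPCDet expects gt_boxes as an (N, 7) array of [x, y, z, dx, dy, dz, heading]. A sketch of converting rows in the annotation format above into that layout; the mapping of width/length onto dx/dy is an assumption about the dataset's convention, so swap it if your boxes are defined differently:

```python
import numpy as np

def annotations_to_gt_boxes(rows):
    """Convert rows of
        id confidence center_x center_y center_z width length height yaw class_name
    into OpenPCDet-style gt_boxes (N, 7) [x, y, z, dx, dy, dz, heading]
    plus the matching gt_names array.

    Assumption: dx = length, dy = width; adjust to your dataset's convention.
    """
    boxes, names = [], []
    for r in rows:
        _id, _conf, x, y, z, w, l, h, yaw, cls = r
        boxes.append([float(x), float(y), float(z),
                      float(l), float(w), float(h), float(yaw)])
        names.append(cls)
    return np.asarray(boxes, dtype=np.float32), np.asarray(names)
```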
Thank you all.
Issue Analytics
- Created 3 years ago
- Comments: 41 (3 by maintainers)

@Gltina Please make sure:
POINT_CLOUD_RANGE defines the range of space that should be voxelized, in other words, the space that contains the points you assume to be relevant and that you have annotations for. For KITTI this space is roughly 40m to each side, 70m to the front, 3m below and 1m above the sensor, hence [0, -39.68, -3, 69.12, 39.68, 1].

VOXEL_SIZE defines the [length, width, height] of each voxel. Since PointPillars uses pillars instead of voxels, the height of a voxel is set to the full height of your point cloud range. For the KITTI frames the default length and width of a voxel is 16cm, hence [0.16, 0.16, 4].

I hope this helps.
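As a sanity check when adapting these values to a custom dataset, the voxel grid size implied by POINT_CLOUD_RANGE and VOXEL_SIZE should come out to whole numbers; a small sketch using the KITTI values above:

```python
import numpy as np

def grid_size(point_cloud_range, voxel_size):
    """Number of voxels along x, y, z implied by the range and voxel size."""
    pcr = np.asarray(point_cloud_range, dtype=np.float64)
    vs = np.asarray(voxel_size, dtype=np.float64)
    return ((pcr[3:6] - pcr[0:3]) / vs).round().astype(int)

# KITTI / PointPillars defaults from the comment above:
print(grid_size([0, -39.68, -3, 69.12, 39.68, 1], [0.16, 0.16, 4]))  # [432 496 1]
```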