Is it possible to provide the config for centerpoint to train on waymo dataset?
After referring to the official code of OpenPCDet and CenterPoint, I wrote a CenterPoint config for training on Waymo. The strange thing is that the CenterPoint-Waymo model I trained with MMDetection3D performs poorly. Can someone help me? Thanks!
Here is my config.
Model config:

```python
voxel_size = [0.32, 0.32, 6]
model = dict(
type='CenterPoint',
pts_voxel_layer=dict(
max_num_points=20, voxel_size=voxel_size, max_voxels=(32000, 32000)),
pts_voxel_encoder=dict(
type='PillarFeatureNet',
in_channels=5,
feat_channels=[64],
with_distance=False,
voxel_size=(0.32, 0.32, 6),
norm_cfg=dict(type='BN1d', eps=1e-3, momentum=0.01),
legacy=False),
pts_middle_encoder=dict(
type='PointPillarsScatter', in_channels=64, output_shape=(512, 512)),
pts_backbone=dict(
type='SECOND',
in_channels=64, # Notice change for multiframe
out_channels=[64, 128, 256],
layer_nums=[3, 5, 5],
layer_strides=[1, 2, 2],
norm_cfg=dict(type='BN', eps=1e-3, momentum=0.01),
conv_cfg=dict(type='Conv2d', bias=False)),
pts_neck=dict(
type='SECONDFPN',
in_channels=[64, 128, 256],
out_channels=[128, 128, 128],
upsample_strides=[1, 2, 4],
norm_cfg=dict(type='BN', eps=1e-3, momentum=0.01),
upsample_cfg=dict(type='deconv', bias=False),
use_conv_for_no_stride=True),
pts_bbox_head=dict(
type='CenterHead',
in_channels=sum([128, 128, 128]),
#in_channels=sum([128, 128, 128, 128, 128, 128]),
tasks=[
dict(num_class=2, class_names=['Car', 'Pedestrian']),
],
common_heads=dict(
reg=(2, 2), height=(1, 2), dim=(3, 2), rot=(2, 2)),
share_conv_channel=64,
bbox_coder=dict(
type='CenterPointBBoxCoder',
post_center_range=[-74.88, -74.88, -2, 74.88, 74.88, 4.0],
max_num=500,
score_threshold=0.1,
out_size_factor=1,
voxel_size=voxel_size[:2],
code_size=7),
separate_head=dict(
type='SeparateHead', init_bias=-2.19, final_kernel=3),
loss_cls=dict(type='GaussianFocalLoss', reduction='mean'),
loss_bbox=dict(type='L1Loss', reduction='mean', loss_weight=2),
norm_bbox=True),
# model training and testing settings
train_cfg=dict(
pts=dict(
grid_size=[512, 512, 1],
voxel_size=voxel_size,
out_size_factor=1,
dense_reg=1,
gaussian_overlap=0.1,
max_objs=500,
min_radius=2,
code_weights=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])),
test_cfg=dict(
pts=dict(
post_center_limit_range=[-80, -80, -10.0, 80, 80, 10.0],
max_per_img=500,
max_pool_nms=False,
min_radius=[4, 12, 10, 1, 0.85, 0.175],
score_threshold=0.1,
pc_range=[-74.88, -74.88],
out_size_factor=1,
voxel_size=voxel_size[:2],
nms_type='rotate',
pre_max_size=4096,
post_max_size=500,
nms_thr=0.7)))
```
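A quick sanity check on the pillar grid implied by these settings (plain Python, not part of the config) shows that the 74.88 m range divided by the 0.32 m voxel size gives a 468x468 BEV grid, not 512x512:

```python
# BEV grid implied by the range and voxel size used above.
point_cloud_range = [-74.88, -74.88, -2.0, 74.88, 74.88, 4.0]
voxel_size = [0.32, 0.32, 6.0]

nx = round((point_cloud_range[3] - point_cloud_range[0]) / voxel_size[0])
ny = round((point_cloud_range[4] - point_cloud_range[1]) / voxel_size[1])
print(nx, ny)  # 468 468 -- output_shape=(512, 512) above does not match
```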
Dataset config:

```python
dataset_type = 'WaymoDataset'
data_root = ''
file_client_args = dict(backend='disk')
class_names = ['Car', 'Pedestrian']
point_cloud_range = [-74.88, -74.88, -2, 74.88, 74.88, 4]
input_modality = dict(use_lidar=True, use_camera=False)
db_sampler = dict(
data_root=data_root,
info_path=data_root + 'waymo_dbinfos_train.pkl',
rate=1.0,
prepare=dict(filter_by_difficulty=[-1], filter_by_min_points=dict(Car=5, Pedestrian=10)),
classes=class_names,
sample_groups=dict(Car=15, Pedestrian=10),
points_loader=dict(
type='LoadPointsFromFile',
coord_type='LIDAR',
load_dim=6,
use_dim=[0, 1, 2, 3, 4],
file_client_args=file_client_args))
train_pipeline = [
dict(
type='LoadPointsFromFile',
coord_type='LIDAR',
load_dim=6,
use_dim=5,
file_client_args=file_client_args),
dict(
type='LoadAnnotations3D',
with_bbox_3d=True,
with_label_3d=True,
with_visibility=False,
file_client_args=file_client_args),
dict(type='ObjectSample', db_sampler=db_sampler),
dict(
type='RandomFlip3D',
sync_2d=False,
flip_ratio_bev_horizontal=0.5,
flip_ratio_bev_vertical=0.5),
dict(
type='GlobalRotScaleTrans',
rot_range=[-0.78539816, 0.78539816],
scale_ratio_range=[0.95, 1.05]),
dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
dict(type='PointShuffle'),
dict(type='DefaultFormatBundle3D', class_names=class_names),
dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
dict(
type='LoadPointsFromFile',
coord_type='LIDAR',
load_dim=6,
use_dim=5,
file_client_args=file_client_args),
dict(
type='MultiScaleFlipAug3D',
img_scale=(1333, 800),
pts_scale_ratio=1,
flip=False,
transforms=[
dict(
type='GlobalRotScaleTrans',
rot_range=[0, 0],
scale_ratio_range=[1., 1.],
translation_std=[0, 0, 0]),
dict(type='RandomFlip3D'),
dict(
type='PointsRangeFilter', point_cloud_range=point_cloud_range),
dict(
type='DefaultFormatBundle3D',
class_names=class_names,
with_label=False),
dict(type='Collect3D', keys=['points'])
])
]
eval_pipeline = [
dict(
type='LoadPointsFromFile',
coord_type='LIDAR',
load_dim=6,
use_dim=5,
file_client_args=file_client_args),
dict(
type='DefaultFormatBundle3D',
class_names=class_names,
with_label=False),
dict(type='Collect3D', keys=['points'])
]
data = dict(
samples_per_gpu=2,
workers_per_gpu=2,
train=dict(
type='RepeatDataset',
times=1,
dataset=dict(
type=dataset_type,
data_root=data_root,
ann_file=data_root + 'waymo_infos_train.pkl',
split='training',
pipeline=train_pipeline,
modality=input_modality,
classes=class_names,
test_mode=False,
# we use box_type_3d='LiDAR' in kitti and nuscenes dataset
# and box_type_3d='Depth' in sunrgbd and scannet dataset.
box_type_3d='LiDAR',
# load one frame every five frames
load_interval=5)),
val=dict(
type=dataset_type,
data_root=data_root,
ann_file=data_root + 'waymo_infos_val.pkl',
split='training',
pipeline=test_pipeline,
modality=input_modality,
classes=class_names,
test_mode=True,
box_type_3d='LiDAR'),
test=dict(
type=dataset_type,
data_root=data_root,
ann_file=data_root + 'waymo_infos_val.pkl',
split='training',
pipeline=test_pipeline,
modality=input_modality,
classes=class_names,
test_mode=True,
box_type_3d='LiDAR'))
evaluation = dict(interval=36, pipeline=eval_pipeline)
```
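The `load_dim=6` / `use_dim=5` pair assumes points produced by mmdet3d's Waymo converter, which stores six float32 values per point (x, y, z, intensity, elongation, timestamp) and of which the first five are fed to the network, matching `in_channels=5` in `PillarFeatureNet`. A quick way to inspect the raw intensities (the file path is hypothetical, adjust to your layout):

```python
import numpy as np

# Inspect one converted Waymo .bin file (hypothetical path). Each point
# has 6 float32 values: x, y, z, intensity, elongation, timestamp.
points = np.fromfile(
    'data/waymo/kitti_format/training/velodyne/0000000.bin',
    dtype=np.float32).reshape(-1, 6)
print(points[:, 3].min(), points[:, 3].max())  # raw intensity can be very large
```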
Optimizer config:

```python
optimizer = dict(type='Adam', betas=(0.9, 0.99), amsgrad=False)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
lr_config = dict(
policy='OneCycle',
max_lr=0.003,
div_factor=10.0, pct_start=0.4,
)
runner = dict(type='EpochBasedRunner', max_epochs=36)
```
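For reference, mmcv's OneCycle LR updater (following PyTorch's `OneCycleLR` semantics) derives the starting learning rate from `max_lr` and `div_factor`; the endpoints implied by the settings above (plain arithmetic, not config code):

```python
# Endpoints implied by the OneCycle settings above.
max_lr = 0.003
div_factor = 10.0
pct_start = 0.4

initial_lr = max_lr / div_factor  # 3e-4 at step 0
print(initial_lr)  # LR ramps up to max_lr over the first 40% of steps,
                   # then anneals back down for the remainder of training.
```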
Final config:

```python
_base_ = [
'../_base_/models/centerpoint_02pillar_second_secfpn_waymo_2cls.py',
'../_base_/datasets/waymoD5-3d-2cls.py',
'../_base_/schedules/cyclic_30e.py',
'../_base_/default_runtime.py',
]
point_cloud_range = [-74.88, -74.88, -2, 74.88, 74.88, 4.0]
model = dict(
pts_voxel_layer=dict(point_cloud_range=point_cloud_range),
pts_voxel_encoder=dict(point_cloud_range=point_cloud_range),
pts_bbox_head=dict(bbox_coder=dict(pc_range=point_cloud_range[:2])),
# model training and testing settings
train_cfg=dict(pts=dict(point_cloud_range=point_cloud_range)),
test_cfg=dict(pts=dict(pc_range=point_cloud_range[:2])))
```
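When a config is assembled from `_base_` files like this, it is worth dumping the merged result to verify that the `point_cloud_range` overrides actually propagated into every sub-dict; a minimal check using the mmcv 1.x `Config` API (the config path is hypothetical):

```python
# Print the fully merged config to verify the _base_ overrides took effect.
from mmcv import Config

cfg = Config.fromfile(
    'configs/centerpoint/centerpoint_02pillar_second_secfpn_waymo_2cls.py')  # hypothetical path
print(cfg.pretty_text)
```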
@RunpeiDong After weeks of checking, I finally found the reason. The poor performance is mainly caused by the intensity values in the Waymo data: Waymo intensity ranges from 0 to 40000 and needs to be normalized. Adding the following code to `LoadPointsFromFile` (line 425 in loading.py) solves the problem.
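The snippet itself did not survive in this copy of the issue; a minimal sketch of the normalization being described, assuming the tanh squashing used for Waymo intensity in the original CenterPoint codebase, would be:

```python
import numpy as np

def normalize_intensity(points: np.ndarray) -> np.ndarray:
    """Sketch of the fix described above (exact lines not preserved).

    Squashes raw Waymo intensity (roughly 0-40000) into (-1, 1).
    Intensity is the 4th column of the loaded point array.
    """
    points[:, 3] = np.tanh(points[:, 3])
    return points
```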
Meanwhile, the `output_shape` and `grid_size` in my config are not correct: they should be (468, 468) and [468, 468, 1] rather than (512, 512) and [512, 512, 1].

Thank you very much, and I will try your suggestion! It is very reasonable to reduce the voxel size and use two heads to improve performance. Meanwhile, I still wonder why my config leads to very poor performance (0.3-0.5 AP), which is far from the official results, even though it is very similar to the official CenterPoint config. I am also sure there is no problem with my data or evaluation, because I successfully trained PointPillars on Waymo and achieved the expected performance.