Is it possible to provide the config for CenterPoint to train on the Waymo dataset?

See original GitHub issue

After referring to the official code of OpenPCDet and CenterPoint, I wrote a config for training CenterPoint on Waymo. Strangely, the CenterPoint-Waymo model I trained with MMDetection3D performs poorly. Can someone help me? Thanks!

Here is my config.

model config

voxel_size = [0.32, 0.32, 6]
model = dict(
    type='CenterPoint',
    pts_voxel_layer=dict(
        max_num_points=20, voxel_size=voxel_size, max_voxels=(32000, 32000)),
    pts_voxel_encoder=dict(
        type='PillarFeatureNet',
        in_channels=5,
        feat_channels=[64],
        with_distance=False,
        voxel_size=(0.32, 0.32, 6),
        norm_cfg=dict(type='BN1d', eps=1e-3, momentum=0.01),
        legacy=False),
    pts_middle_encoder=dict(
        type='PointPillarsScatter', in_channels=64, output_shape=(512, 512)),
    pts_backbone=dict(
        type='SECOND',
        in_channels=64,  # Notice change for multiframe
        out_channels=[64, 128, 256],
        layer_nums=[3, 5, 5],
        layer_strides=[1, 2, 2],
        norm_cfg=dict(type='BN', eps=1e-3, momentum=0.01),
        conv_cfg=dict(type='Conv2d', bias=False)),
    pts_neck=dict(
        type='SECONDFPN',
        in_channels=[64, 128, 256],
        out_channels=[128, 128, 128],
        upsample_strides=[1, 2, 4],
        norm_cfg=dict(type='BN', eps=1e-3, momentum=0.01),
        upsample_cfg=dict(type='deconv', bias=False),
        use_conv_for_no_stride=True),
    pts_bbox_head=dict(
        type='CenterHead',
        in_channels=sum([128, 128, 128]),  
        #in_channels=sum([128, 128, 128, 128, 128, 128]),
        tasks=[
            dict(num_class=2, class_names=['Car', 'Pedestrian']),
        ],
        common_heads=dict(
            reg=(2, 2), height=(1, 2), dim=(3, 2), rot=(2, 2)),
        share_conv_channel=64,
        bbox_coder=dict(
            type='CenterPointBBoxCoder',
            post_center_range=[-74.88, -74.88, -2, 74.88, 74.88, 4.0],
            max_num=500,
            score_threshold=0.1,
            out_size_factor=1,
            voxel_size=voxel_size[:2],
            code_size=7),
        separate_head=dict(
            type='SeparateHead', init_bias=-2.19, final_kernel=3),
        loss_cls=dict(type='GaussianFocalLoss', reduction='mean'),
        loss_bbox=dict(type='L1Loss', reduction='mean', loss_weight=2),
        norm_bbox=True),
    # model training and testing settings
    train_cfg=dict(
        pts=dict(
            grid_size=[512, 512, 1],
            voxel_size=voxel_size,
            out_size_factor=1,
            dense_reg=1,
            gaussian_overlap=0.1,
            max_objs=500,
            min_radius=2,
            code_weights=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])),
    test_cfg=dict(
        pts=dict(
            post_center_limit_range=[-80, -80, -10.0, 80, 80, 10.0],
            max_per_img=500,
            max_pool_nms=False,
            min_radius=[4, 12, 10, 1, 0.85, 0.175],
            score_threshold=0.1,
            pc_range=[-74.88, -74.88],
            out_size_factor=1,
            voxel_size=voxel_size[:2],
            nms_type='rotate',
            pre_max_size=4096,
            post_max_size=500,
            nms_thr=0.7)))
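
As a quick sanity check on the config above: the BEV grid implied by point_cloud_range and voxel_size can be derived directly, and with a ±74.88 m range and 0.32 m pillars it comes out to 468 × 468 rather than the 512 × 512 used for output_shape and grid_size (a mismatch the author confirms in the comments below). A minimal sketch using only values already present in the config:

# Sanity check: derive the BEV grid from point_cloud_range and voxel_size.
point_cloud_range = [-74.88, -74.88, -2, 74.88, 74.88, 4.0]
voxel_size = [0.32, 0.32, 6]
grid_x = round((point_cloud_range[3] - point_cloud_range[0]) / voxel_size[0])
grid_y = round((point_cloud_range[4] - point_cloud_range[1]) / voxel_size[1])
print(grid_x, grid_y)  # 468 468 -- so output_shape=(512, 512) above is inconsistent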

dataset config

dataset_type = 'WaymoDataset'  # referenced by the train/val/test dicts below
data_root = ''
file_client_args = dict(backend='disk')

class_names = ['Car', 'Pedestrian']
point_cloud_range = [-74.88, -74.88, -2, 74.88, 74.88, 4]
input_modality = dict(use_lidar=True, use_camera=False)
db_sampler = dict(
    data_root=data_root,
    info_path=data_root + 'waymo_dbinfos_train.pkl',
    rate=1.0,
    prepare=dict(filter_by_difficulty=[-1], filter_by_min_points=dict(Car=5, Pedestrian=10)),
    classes=class_names,
    sample_groups=dict(Car=15, Pedestrian=10),
    points_loader=dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=6,
        use_dim=[0, 1, 2, 3, 4],
        file_client_args=file_client_args))

train_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=6,
        use_dim=5,
        file_client_args=file_client_args),
    dict(
        type='LoadAnnotations3D',
        with_bbox_3d=True,
        with_label_3d=True,
        with_visibility=False,
        file_client_args=file_client_args),
    dict(type='ObjectSample', db_sampler=db_sampler),
    dict(
        type='RandomFlip3D',
        sync_2d=False,
        flip_ratio_bev_horizontal=0.5,
        flip_ratio_bev_vertical=0.5),
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-0.78539816, 0.78539816],
        scale_ratio_range=[0.95, 1.05]),
    dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='PointShuffle'),
    dict(type='DefaultFormatBundle3D', class_names=class_names),
    dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=6,
        use_dim=5,
        file_client_args=file_client_args),
    dict(
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type='GlobalRotScaleTrans',
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(type='RandomFlip3D'),
            dict(
                type='PointsRangeFilter', point_cloud_range=point_cloud_range),
            dict(
                type='DefaultFormatBundle3D',
                class_names=class_names,
                with_label=False),
            dict(type='Collect3D', keys=['points'])
        ])
]

eval_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=6,
        use_dim=5,
        file_client_args=file_client_args),
    dict(
        type='DefaultFormatBundle3D',
        class_names=class_names,
        with_label=False),
    dict(type='Collect3D', keys=['points'])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='RepeatDataset',
        times=1,
        dataset=dict(
            type=dataset_type,
            data_root=data_root,
            ann_file=data_root + 'waymo_infos_train.pkl',
            split='training',
            pipeline=train_pipeline,
            modality=input_modality,
            classes=class_names,
            test_mode=False,
            # we use box_type_3d='LiDAR' in kitti and nuscenes dataset
            # and box_type_3d='Depth' in sunrgbd and scannet dataset.
            box_type_3d='LiDAR',
            # load one frame every five frames
            load_interval=5)),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'waymo_infos_val.pkl',
        split='training',
        pipeline=test_pipeline,
        modality=input_modality,
        classes=class_names,
        test_mode=True,
        box_type_3d='LiDAR'),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'waymo_infos_val.pkl',
        split='training',
        pipeline=test_pipeline,
        modality=input_modality,
        classes=class_names,
        test_mode=True,
        box_type_3d='LiDAR'))

evaluation = dict(interval=36, pipeline=eval_pipeline)
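
Before training, it is also worth inspecting the raw intensity range of the converted point clouds, since Waymo intensities are not pre-normalized (this turns out to be the root cause; see the first comment below). A minimal inspection sketch, assuming the mmdet3d Waymo converter's .bin layout of six float32 values per point (x, y, z, intensity, elongation, timestamp; this matches load_dim=6 in the pipelines above) and a hypothetical file path:

import numpy as np

# Hypothetical path to one converted Waymo point-cloud file.
pts = np.fromfile('data/waymo/kitti_format/training/velodyne/0000000.bin',
                  dtype=np.float32).reshape(-1, 6)
print(pts[:, 3].min(), pts[:, 3].max())  # intensity; reportedly up to ~40000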

optimizer config

optimizer = dict(type='Adam', betas=(0.9, 0.99), amsgrad=False)  # lr is driven by the OneCycle policy below

optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
lr_config = dict(
    policy='OneCycle',
    max_lr=0.003,
    div_factor=10.0, pct_start=0.4,
)

runner = dict(type='EpochBasedRunner', max_epochs=36)
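
For reference, mmcv's OneCycle policy (like PyTorch's OneCycleLR) warms the learning rate up from max_lr / div_factor to max_lr over the first pct_start fraction of all steps and then anneals it back down. A quick check of the implied starting point (plain arithmetic, not mmcv code):

max_lr, div_factor, pct_start = 0.003, 10.0, 0.4
initial_lr = max_lr / div_factor
print(initial_lr)  # 0.0003 -- rises to 0.003 over the first 40% of iterations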

final config

_base_ = [
    '../_base_/models/centerpoint_02pillar_second_secfpn_waymo_2cls.py',
    '../_base_/datasets/waymoD5-3d-2cls.py',
    '../_base_/schedules/cyclic_30e.py',
    '../_base_/default_runtime.py',
]

point_cloud_range = [-74.88, -74.88, -2, 74.88, 74.88, 4.0]
model = dict(
    pts_voxel_layer=dict(point_cloud_range=point_cloud_range),
    pts_voxel_encoder=dict(point_cloud_range=point_cloud_range),
    pts_bbox_head=dict(bbox_coder=dict(pc_range=point_cloud_range[:2])),
    # model training and testing settings
    train_cfg=dict(pts=dict(point_cloud_range=point_cloud_range)),
    test_cfg=dict(pts=dict(pc_range=point_cloud_range[:2])))

Issue Analytics

  • State: open
  • Created: a year ago
  • Comments: 7

Top GitHub Comments

9 reactions
ZZY816 commented, Jul 29, 2022

@RunpeiDong After weeks of checking, I finally found the reason. The poor model performance is mainly caused by the intensity values in the Waymo data: Waymo intensity ranges from 0 to 40000 and should be normalized. Adding the following line to the class LoadPointsFromFile (line 425 in loading.py) solves the problem.

points[:, 3] = np.tanh(points[:, 3])

Meanwhile, the output_shape and grid_size in my config were also incorrect: they should be (468, 468) and [468, 468, 1] rather than (512, 512) and [512, 512, 1].
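
For intuition about the tanh fix above (a standalone numeric sketch, not mmdet3d code): tanh squashes the raw 0-40000 intensities into [0, 1], with everything above a few units saturating near 1:

import numpy as np

# Illustrative raw intensities spanning the reported 0-40000 Waymo range.
intensity = np.array([0.0, 0.5, 1.0, 2.0, 40000.0], dtype=np.float32)
print(np.tanh(intensity))  # ~[0. 0.462 0.762 0.964 1.]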

1 reaction
ZZY816 commented, Jul 13, 2022

Thank you very much, and I will try your suggestion! It is very reasonable that reducing the voxel size and using two separate heads should improve performance. Meanwhile, I still wonder why my config leads to very poor performance (0.3-0.5 AP), far from the official results, even though it is very similar to the official CenterPoint config. I am also confident there is no problem with my data or evaluation, because I successfully trained a PointPillars model on Waymo with them and achieved the expected performance.
