Is it possible to provide the config for CenterPoint to train on the Waymo dataset?

See original GitHub issue

After referring to the official code of OpenPCDet and CenterPoint, I wrote a config for training CenterPoint on Waymo. Strangely, the CenterPoint-Waymo model I trained with MMDetection3D performs poorly. Can someone help me? Thanks!

Here is my config.

model config

voxel_size = [0.32, 0.32, 6]
model = dict(
    type='CenterPoint',
    pts_voxel_layer=dict(
        max_num_points=20, voxel_size=voxel_size, max_voxels=(32000, 32000)),
    pts_voxel_encoder=dict(
        type='PillarFeatureNet',
        in_channels=5,
        feat_channels=[64],
        with_distance=False,
        voxel_size=(0.32, 0.32, 6),
        norm_cfg=dict(type='BN1d', eps=1e-3, momentum=0.01),
        legacy=False),
    pts_middle_encoder=dict(
        type='PointPillarsScatter', in_channels=64, output_shape=(512, 512)),
    pts_backbone=dict(
        type='SECOND',
        in_channels=64,  # Notice change for multiframe
        out_channels=[64, 128, 256],
        layer_nums=[3, 5, 5],
        layer_strides=[1, 2, 2],
        norm_cfg=dict(type='BN', eps=1e-3, momentum=0.01),
        conv_cfg=dict(type='Conv2d', bias=False)),
    pts_neck=dict(
        type='SECONDFPN',
        in_channels=[64, 128, 256],
        out_channels=[128, 128, 128],
        upsample_strides=[1, 2, 4],
        norm_cfg=dict(type='BN', eps=1e-3, momentum=0.01),
        upsample_cfg=dict(type='deconv', bias=False),
        use_conv_for_no_stride=True),
    pts_bbox_head=dict(
        type='CenterHead',
        in_channels=sum([128, 128, 128]),  
        #in_channels=sum([128, 128, 128, 128, 128, 128]),
        tasks=[
            dict(num_class=2, class_names=['Car', 'Pedestrian']),
        ],
        common_heads=dict(
            reg=(2, 2), height=(1, 2), dim=(3, 2), rot=(2, 2)),
        share_conv_channel=64,
        bbox_coder=dict(
            type='CenterPointBBoxCoder',
            post_center_range=[-74.88, -74.88, -2, 74.88, 74.88, 4.0],
            max_num=500,
            score_threshold=0.1,
            out_size_factor=1,
            voxel_size=voxel_size[:2],
            code_size=7),
        separate_head=dict(
            type='SeparateHead', init_bias=-2.19, final_kernel=3),
        loss_cls=dict(type='GaussianFocalLoss', reduction='mean'),
        loss_bbox=dict(type='L1Loss', reduction='mean', loss_weight=2),
        norm_bbox=True),
    # model training and testing settings
    train_cfg=dict(
        pts=dict(
            grid_size=[512, 512, 1],
            voxel_size=voxel_size,
            out_size_factor=1,
            dense_reg=1,
            gaussian_overlap=0.1,
            max_objs=500,
            min_radius=2,
            code_weights=[1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])),
    test_cfg=dict(
        pts=dict(
            post_center_limit_range=[-80, -80, -10.0, 80, 80, 10.0],
            max_per_img=500,
            max_pool_nms=False,
            min_radius=[4, 12, 10, 1, 0.85, 0.175],
            score_threshold=0.1,
            pc_range=[-74.88, -74.88],
            out_size_factor=1,
            voxel_size=voxel_size[:2],
            nms_type='rotate',
            pre_max_size=4096,
            post_max_size=500,
            nms_thr=0.7)))
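
As a quick sanity check on the config above: the BEV grid implied by point_cloud_range and voxel_size can be derived directly, and with a ±74.88 m range and 0.32 m pillars it comes out to 468 × 468 rather than the 512 × 512 used for output_shape and grid_size (a mismatch the author confirms in the comments below). A minimal sketch using only values already present in the config:

# Sanity check: derive the BEV grid from point_cloud_range and voxel_size.
point_cloud_range = [-74.88, -74.88, -2, 74.88, 74.88, 4.0]
voxel_size = [0.32, 0.32, 6]
grid_x = round((point_cloud_range[3] - point_cloud_range[0]) / voxel_size[0])
grid_y = round((point_cloud_range[4] - point_cloud_range[1]) / voxel_size[1])
print(grid_x, grid_y)  # 468 468 -- so output_shape=(512, 512) above is inconsistent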

dataset config

dataset_type = 'WaymoDataset'  # referenced by the train/val/test dicts below
data_root = ''
file_client_args = dict(backend='disk')

class_names = ['Car', 'Pedestrian']
point_cloud_range = [-74.88, -74.88, -2, 74.88, 74.88, 4]
input_modality = dict(use_lidar=True, use_camera=False)
db_sampler = dict(
    data_root=data_root,
    info_path=data_root + 'waymo_dbinfos_train.pkl',
    rate=1.0,
    prepare=dict(filter_by_difficulty=[-1], filter_by_min_points=dict(Car=5, Pedestrian=10)),
    classes=class_names,
    sample_groups=dict(Car=15, Pedestrian=10),
    points_loader=dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=6,
        use_dim=[0, 1, 2, 3, 4],
        file_client_args=file_client_args))

train_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=6,
        use_dim=5,
        file_client_args=file_client_args),
    dict(
        type='LoadAnnotations3D',
        with_bbox_3d=True,
        with_label_3d=True,
        with_visibility=False,
        file_client_args=file_client_args),
    dict(type='ObjectSample', db_sampler=db_sampler),
    dict(
        type='RandomFlip3D',
        sync_2d=False,
        flip_ratio_bev_horizontal=0.5,
        flip_ratio_bev_vertical=0.5),
    dict(
        type='GlobalRotScaleTrans',
        rot_range=[-0.78539816, 0.78539816],
        scale_ratio_range=[0.95, 1.05]),
    dict(type='PointsRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='ObjectRangeFilter', point_cloud_range=point_cloud_range),
    dict(type='PointShuffle'),
    dict(type='DefaultFormatBundle3D', class_names=class_names),
    dict(type='Collect3D', keys=['points', 'gt_bboxes_3d', 'gt_labels_3d'])
]
test_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=6,
        use_dim=5,
        file_client_args=file_client_args),
    dict(
        type='MultiScaleFlipAug3D',
        img_scale=(1333, 800),
        pts_scale_ratio=1,
        flip=False,
        transforms=[
            dict(
                type='GlobalRotScaleTrans',
                rot_range=[0, 0],
                scale_ratio_range=[1., 1.],
                translation_std=[0, 0, 0]),
            dict(type='RandomFlip3D'),
            dict(
                type='PointsRangeFilter', point_cloud_range=point_cloud_range),
            dict(
                type='DefaultFormatBundle3D',
                class_names=class_names,
                with_label=False),
            dict(type='Collect3D', keys=['points'])
        ])
]

eval_pipeline = [
    dict(
        type='LoadPointsFromFile',
        coord_type='LIDAR',
        load_dim=6,
        use_dim=5,
        file_client_args=file_client_args),
    dict(
        type='DefaultFormatBundle3D',
        class_names=class_names,
        with_label=False),
    dict(type='Collect3D', keys=['points'])
]

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='RepeatDataset',
        times=1,
        dataset=dict(
            type=dataset_type,
            data_root=data_root,
            ann_file=data_root + 'waymo_infos_train.pkl',
            split='training',
            pipeline=train_pipeline,
            modality=input_modality,
            classes=class_names,
            test_mode=False,
            # we use box_type_3d='LiDAR' in kitti and nuscenes dataset
            # and box_type_3d='Depth' in sunrgbd and scannet dataset.
            box_type_3d='LiDAR',
            # load one frame every five frames
            load_interval=5)),
    val=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'waymo_infos_val.pkl',
        split='training',
        pipeline=test_pipeline,
        modality=input_modality,
        classes=class_names,
        test_mode=True,
        box_type_3d='LiDAR'),
    test=dict(
        type=dataset_type,
        data_root=data_root,
        ann_file=data_root + 'waymo_infos_val.pkl',
        split='training',
        pipeline=test_pipeline,
        modality=input_modality,
        classes=class_names,
        test_mode=True,
        box_type_3d='LiDAR'))

evaluation = dict(interval=36, pipeline=eval_pipeline)
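
Before training, it is also worth inspecting the raw intensity range of the converted point clouds, since Waymo intensities are not pre-normalized (this turns out to be the root cause; see the first comment below). A minimal inspection sketch, assuming the mmdet3d Waymo converter's .bin layout of six float32 values per point (x, y, z, intensity, elongation, timestamp; this matches load_dim=6 in the pipelines above) and a hypothetical file path:

import numpy as np

# Hypothetical path to one converted Waymo point-cloud file.
pts = np.fromfile('data/waymo/kitti_format/training/velodyne/0000000.bin',
                  dtype=np.float32).reshape(-1, 6)
print(pts[:, 3].min(), pts[:, 3].max())  # intensity; reportedly up to ~40000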

optimizer config

optimizer = dict(type='Adam', betas=(0.9, 0.99), amsgrad=False)  # lr is driven by the OneCycle policy below

optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
lr_config = dict(
    policy='OneCycle',
    max_lr=0.003,
    div_factor=10.0, pct_start=0.4,
)

runner = dict(type='EpochBasedRunner', max_epochs=36)
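
For reference, mmcv's OneCycle policy (like PyTorch's OneCycleLR) warms the learning rate up from max_lr / div_factor to max_lr over the first pct_start fraction of all steps and then anneals it back down. A quick check of the implied starting point (plain arithmetic, not mmcv code):

max_lr, div_factor, pct_start = 0.003, 10.0, 0.4
initial_lr = max_lr / div_factor
print(initial_lr)  # 0.0003 -- rises to 0.003 over the first 40% of iterations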

final config

_base_ = [
    '../_base_/models/centerpoint_02pillar_second_secfpn_waymo_2cls.py',
    '../_base_/datasets/waymoD5-3d-2cls.py',
    '../_base_/schedules/cyclic_30e.py',
    '../_base_/default_runtime.py',
]

point_cloud_range = [-74.88, -74.88, -2, 74.88, 74.88, 4.0]
model = dict(
    pts_voxel_layer=dict(point_cloud_range=point_cloud_range),
    pts_voxel_encoder=dict(point_cloud_range=point_cloud_range),
    pts_bbox_head=dict(bbox_coder=dict(pc_range=point_cloud_range[:2])),
    # model training and testing settings
    train_cfg=dict(pts=dict(point_cloud_range=point_cloud_range)),
    test_cfg=dict(pts=dict(pc_range=point_cloud_range[:2])))

Issue Analytics

  • State: open
  • Created: a year ago
  • Comments: 7

Top GitHub Comments

9 reactions
ZZY816 commented, Jul 29, 2022

@RunpeiDong After weeks of checking, I finally found the reason. The poor model performance is mainly caused by the intensity values in the Waymo data: Waymo intensity ranges from 0 to 40000 and should be normalized. Adding the following line to the class LoadPointsFromFile (line 425 in loading.py) solves the problem.

points[:, 3] = np.tanh(points[:, 3])

Meanwhile, the output_shape and grid_size in my config were also incorrect: they should be (468, 468) and [468, 468, 1] rather than (512, 512) and [512, 512, 1].
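
For intuition about the tanh fix above (a standalone numeric sketch, not mmdet3d code): tanh squashes the raw 0-40000 intensities into [0, 1], with everything above a few units saturating near 1:

import numpy as np

# Illustrative raw intensities spanning the reported 0-40000 Waymo range.
intensity = np.array([0.0, 0.5, 1.0, 2.0, 40000.0], dtype=np.float32)
print(np.tanh(intensity))  # ~[0. 0.462 0.762 0.964 1.]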

1 reaction
ZZY816 commented, Jul 13, 2022

Thank you very much, and I will try your suggestion! It is very reasonable that reducing the voxel size and using two separate heads should improve performance. Meanwhile, I still wonder why my config leads to very poor performance (0.3-0.5 AP), far from the official results, even though it is very similar to the official CenterPoint config. I am also confident there is no problem with my data or evaluation, because I successfully trained a PointPillars model on Waymo with them and achieved the expected performance.
