
Boxnet getting better results than Votenet


Sorry that this is not a general issue; I raised it in case it helps other people training on their own custom datasets.

I’m testing on my custom dataset and finding that Boxnet gets better results than Votenet. To simplify training and dataset creation, the dataset has only a single object class and a single heading class.

Boxnet:
eval mean box_loss: 0.143415
eval mean center_loss: 0.033522
eval mean heading_cls_loss: 0.408684
eval mean heading_reg_loss: 0.021197
eval mean loss: 1.641386
eval mean neg_ratio: 0.340674
eval mean obj_acc: 0.952075
eval mean objectness_loss: 0.041447
eval mean pos_ratio: 0.659326
eval mean sem_cls_loss: 0.000000
eval mean size_cls_loss: 0.000000
eval mean size_reg_loss: 0.047828
0 0.8460753105326817
eval person Average Precision: 0.846075
eval mAP: 0.846075
eval person Recall: 0.984816
eval AR: 0.984816

Votenet:
eval mean box_loss: 0.131423
eval mean center_loss: 0.038023
eval mean heading_cls_loss: 0.539432
eval mean heading_reg_loss: 0.001078
eval mean loss: 5.734286
eval mean neg_ratio: 0.920239
eval mean obj_acc: 0.984032
eval mean objectness_loss: 0.015435
eval mean pos_ratio: 0.018091
eval mean sem_cls_loss: 0.000000
eval mean size_cls_loss: 0.000000
eval mean size_reg_loss: 0.038379
eval mean vote_loss: 0.434288
eval person Average Precision: 0.382230
eval mAP: 0.382230
eval person Recall: 0.558568
eval AR: 0.558568

Has anyone experienced similar issues, or does anyone have tips?

Some things I noticed:

The voting loss doesn’t converge well on the custom dataset (eval mean vote_loss: 0.434288).
The pos_ratio is quite low for Votenet (0.018091, versus 0.659326 for Boxnet).
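
One possible reading of the low pos_ratio: in the Votenet log above, pos_ratio and neg_ratio sum to about 0.94 rather than 1, which is consistent with objectness labels being assigned by distance bands, with an ignored band in between. The sketch below illustrates that idea only. It assumes each proposal is labeled positive when its center is within `near` metres of the nearest ground-truth box center and negative when it is farther than `far`; the function name and the 0.3 / 0.6 thresholds are assumptions for illustration, not values taken from this issue.

```python
import numpy as np

def objectness_ratios(proposal_centers, gt_centers, near=0.3, far=0.6):
    """Return (pos_ratio, neg_ratio) for one scene.

    Hypothetical helper: `near` and `far` are assumed distance
    thresholds in metres, not values confirmed by the issue.
    """
    # Pairwise distances between proposals and ground-truth centers,
    # shape (num_proposals, num_gt).
    diff = proposal_centers[:, None, :] - gt_centers[None, :, :]
    nearest = np.linalg.norm(diff, axis=-1).min(axis=1)
    pos = nearest < near   # close to some object center
    neg = nearest > far    # far from every object center
    # Proposals between `near` and `far` count as neither.
    return float(pos.mean()), float(neg.mean())

# Toy scene: one object at the origin, four proposals at increasing distance.
gt = np.array([[0.0, 0.0, 0.0]])
proposals = np.array([
    [0.1, 0.0, 0.0],
    [0.5, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [3.0, 0.0, 0.0],
])
pos_ratio, neg_ratio = objectness_ratios(proposals, gt)
print(pos_ratio, neg_ratio)
```

Under this assumption, a pos_ratio near 0.02 would mean that very few proposals end up close to an object center, which fits with a vote loss that has not converged.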


Top GitHub Comments


kentaroy47 commented, Oct 30, 2019

@jediofgever I followed the instructions at https://github.com/facebookresearch/votenet/blob/master/doc/tips.md and created custom versions of sunrgbd/sunrgbd_data.py and sunrgbd/model_util_config.py.

The only thing you should need to modify in the votenet code is the mean size of each class, which is used to regress the bounding boxes. The main work is generating the point cloud, bounding box, and vote numpy files. I highly recommend generating the SUN RGB-D dataset first (you need MATLAB…) and looking at what those numpy files contain.

For a file ID of hoge, the dataset should include:

hoge_pc.npy (an [N,3] or [N,6] point cloud; I didn’t use RGB information)
hoge_bbox.npy ([num_obj, 8] bounding box annotations)
hoge_votes.npy ([N,10] vote data, one row per point)
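
The file layout above can be sanity-checked before training. This is a minimal sketch that only verifies the shapes listed in this comment; `check_scene_arrays` is a hypothetical helper, the arrays are synthetic zeros written to a temporary directory, and the meaning of the individual columns is not checked because the comment does not spell it out.

```python
import os
import tempfile

import numpy as np

def check_scene_arrays(pc, bbox, votes):
    """Check the shapes described above; return (num_points, num_obj).

    Hypothetical helper, not part of the votenet repository.
    """
    if pc.ndim != 2 or pc.shape[1] not in (3, 6):
        raise ValueError(f"pc should be [N,3] or [N,6], got {pc.shape}")
    if bbox.ndim != 2 or bbox.shape[1] != 8:
        raise ValueError(f"bbox should be [num_obj,8], got {bbox.shape}")
    if votes.shape != (pc.shape[0], 10):
        raise ValueError(f"votes should be [N,10] with N={pc.shape[0]}, got {votes.shape}")
    return pc.shape[0], bbox.shape[0]

# Write synthetic files using the naming from the comment, then reload them.
with tempfile.TemporaryDirectory() as root:
    file_id = "hoge"
    np.save(os.path.join(root, f"{file_id}_pc.npy"), np.zeros((2048, 3), dtype=np.float32))
    np.save(os.path.join(root, f"{file_id}_bbox.npy"), np.zeros((2, 8), dtype=np.float32))
    np.save(os.path.join(root, f"{file_id}_votes.npy"), np.zeros((2048, 10), dtype=np.float32))

    pc = np.load(os.path.join(root, f"{file_id}_pc.npy"))
    bbox = np.load(os.path.join(root, f"{file_id}_bbox.npy"))
    votes = np.load(os.path.join(root, f"{file_id}_votes.npy"))
    shape_summary = check_scene_arrays(pc, bbox, votes)

print(shape_summary)
```

Running the same check over every file ID in a custom dataset catches mismatched point counts between the point cloud and vote files early.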


charlesq34 commented, Nov 7, 2019

@chinacui

It’s possible. If the scene contains mostly background points, votes may not have very high density, so the difference between boxnet and votenet could be small.

A variation of the method is to predict binary scores for foreground and background classes for each point and then weight point features by the predicted scores (therefore foreground points contribute more to the voting).
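
The weighting step in that variation can be sketched in a few lines of NumPy. This is an illustration under assumptions, not code from the repository: the function name is hypothetical, the per-point logits are assumed to have two columns ordered [background, foreground], and in a real model they would come from a learned per-point classifier rather than being written by hand.

```python
import numpy as np

def weight_features_by_foreground(features, fg_logits):
    """Scale each point's feature vector by its predicted foreground probability.

    features:  (N, C) per-point features
    fg_logits: (N, 2) per-point logits, columns assumed to be [background, foreground]
    """
    # Numerically stable softmax over the two classes.
    shifted = fg_logits - fg_logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=1, keepdims=True)
    fg_prob = probs[:, 1]
    # Foreground points keep most of their feature magnitude,
    # background points are scaled toward zero.
    return features * fg_prob[:, None], fg_prob

features = np.ones((3, 4), dtype=np.float32)
fg_logits = np.array([
    [0.0, 0.0],     # undecided
    [10.0, -10.0],  # confident background
    [-10.0, 10.0],  # confident foreground
])
weighted, fg_prob = weight_features_by_foreground(features, fg_logits)
print(fg_prob)
```

With this weighting, background points still take part in voting but contribute much less to the aggregated features.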
