Some code questions
Hello @pangsu0613, could you please explain in words the idea behind this algorithm for finding the overlaps between the projected 3D boxes (called ‘boxes’ in the code) and the 2D boxes (called ‘query_boxes’ in the code)?
a)
# pang added to build the tensor for the second stage of training
@numba.jit(nopython=True, parallel=True)
def build_stage2_training(boxes, query_boxes, criterion, scores_3d, scores_2d, dis_to_lidar_3d, overlaps, tensor_index):
    N = boxes.shape[0]        # 70400
    K = query_boxes.shape[0]  # 30
    max_num = 900000
    ind = 0
    ind_max = ind
    for k in range(K):
        qbox_area = ((query_boxes[k, 2] - query_boxes[k, 0]) *
                     (query_boxes[k, 3] - query_boxes[k, 1]))
        for n in range(N):
            # overlap width along the x axis
            iw = (min(boxes[n, 2], query_boxes[k, 2]) -
                  max(boxes[n, 0], query_boxes[k, 0]))
            if iw > 0:
                # overlap height along the y axis
                ih = (min(boxes[n, 3], query_boxes[k, 3]) -
                      max(boxes[n, 1], query_boxes[k, 1]))
                if ih > 0:
                    if criterion == -1:
                        # union area: box + query box - intersection
                        ua = (
                            (boxes[n, 2] - boxes[n, 0]) *
                            (boxes[n, 3] - boxes[n, 1]) + qbox_area - iw * ih)
                    elif criterion == 0:
                        # normalize by the projected 3D box area
                        ua = ((boxes[n, 2] - boxes[n, 0]) *
                              (boxes[n, 3] - boxes[n, 1]))
                    elif criterion == 1:
                        # normalize by the 2D box area
                        ua = qbox_area
                    else:
                        ua = 1.0
                    overlaps[ind, 0] = iw * ih / ua
                    overlaps[ind, 1] = scores_3d[n, 0]
                    overlaps[ind, 2] = scores_2d[k, 0]
                    overlaps[ind, 3] = dis_to_lidar_3d[n, 0]
                    tensor_index[ind, 0] = k
                    tensor_index[ind, 1] = n
                    ind = ind + 1
                elif k == K - 1:
                    # no y overlap; on the last 2D box, still record the
                    # 3D detection with a sentinel IoU of -10
                    overlaps[ind, 0] = -10
                    overlaps[ind, 1] = scores_3d[n, 0]
                    overlaps[ind, 2] = -10
                    overlaps[ind, 3] = dis_to_lidar_3d[n, 0]
                    tensor_index[ind, 0] = k
                    tensor_index[ind, 1] = n
                    ind = ind + 1
            elif k == K - 1:
                # no x overlap; on the last 2D box, still record the
                # 3D detection with a sentinel IoU of -10
                overlaps[ind, 0] = -10
                overlaps[ind, 1] = scores_3d[n, 0]
                overlaps[ind, 2] = -10
                overlaps[ind, 3] = dis_to_lidar_3d[n, 0]
                tensor_index[ind, 0] = k
                tensor_index[ind, 1] = n
                ind = ind + 1
    if ind > ind_max:
        ind_max = ind
    return overlaps, tensor_index, ind
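For reference, stripped of the tensor bookkeeping and the numba decorator, the overlap test the two inner loops perform can be sketched in plain NumPy (a minimal illustration with made-up boxes, not the repository's code):

```python
import numpy as np

def pairwise_iou(boxes, query_boxes):
    """IoU between axis-aligned boxes [x1, y1, x2, y2] (criterion == -1)."""
    ious = np.zeros((boxes.shape[0], query_boxes.shape[0]))
    for k in range(query_boxes.shape[0]):
        q = query_boxes[k]
        q_area = (q[2] - q[0]) * (q[3] - q[1])
        for n in range(boxes.shape[0]):
            b = boxes[n]
            iw = min(b[2], q[2]) - max(b[0], q[0])  # overlap along x
            if iw <= 0:
                continue
            ih = min(b[3], q[3]) - max(b[1], q[1])  # overlap along y
            if ih <= 0:
                continue
            union = (b[2] - b[0]) * (b[3] - b[1]) + q_area - iw * ih
            ious[n, k] = iw * ih / union
    return ious

boxes = np.array([[0.0, 0.0, 2.0, 2.0]])
query = np.array([[1.0, 1.0, 3.0, 3.0]])
# intersection = 1, union = 4 + 4 - 1 = 7, so IoU = 1/7
print(pairwise_iou(boxes, query))
```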
b) Here, when you calculate the feature ‘distance_to_the_lidar’, why do you divide by 82.0? https://github.com/pangsu0613/CLOCs/blob/b2f0e23b0deb0e192121bbd563691f49c85d3bbc/second/pytorch/models/voxelnet.py#L497
c) Also, I don’t understand why the output scores of the fusion network, ‘cls_pred’, are raw logits even though the input 3D and 2D scores were in sigmoid format. Could you please explain the reason?
Issue Analytics
- Created: 2 years ago
- Comments: 21 (9 by maintainers)
Top GitHub Comments
Hello @xavidzo
(a) The 3D-to-2D box projection is not done here; it is done in voxelnet.py. The inputs to this function are the projected 3D boxes (after projection there are 8 corner points per box, but we take the min and max x and y to form an axis-aligned 2D box, which is the same format as the 2D detector outputs), the 2D boxes from the 2D detector, and the related corresponding information. The purpose of this function is to build the input tensor for fusion: since we only care about overlapping projected 3D and 2D detections, we calculate the IoU between them. The main idea for calculating the IoU is to first check whether the two boxes overlap along the x axis; if yes, check the y direction; if both checks pass, the boxes overlap, and we compute the overlapped region and so on.
(b) Because in SECOND the detection field of view in LiDAR coordinates is set to 0 < x < 70.4, -40 < y < 40, -3 < z < 1, the longest distance in the x-y plane is sqrt(70.4^2 + 40^2), which is around 81. I use 82 to make the normalized value smaller than 1. This value does not have a big impact on the final results; 81 or 80 also works fine.
(c) Because the fusion layers are CNNs and the final output layer does not have a ReLU nonlinearity, the network outputs raw logits. This is similar to most of the existing detection and classification heads used in other works.
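The 82.0 normalization in (b) can be checked directly; the 55 m example distance below is hypothetical:

```python
import math

# SECOND's detection range in the LiDAR frame: 0 < x < 70.4, -40 < y < 40
max_dist_xy = math.hypot(70.4, 40.0)  # farthest x-y point, ~81.0 m

d = 55.0           # hypothetical distance of a 3D detection to the LiDAR
d_norm = d / 82.0  # 82 > max_dist_xy, so the feature always stays below 1
```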
Hello @pangsu0613, so I trained a CenterPoint model for multi-class fusion; basically, as you suggested, I used one fusion layer for every class. These are my results:
I trained the CLOCs fusion layers with the fast focal loss of CenterPoint. It converges really fast, in one or two epochs; more epochs make the results worse. For the case where there is no IoU between the projected 3D bbox and the 2D bbox, I changed the default parameter from -10 to -1000, which gave me the best results. The results look good in terms of increasing the mAP; however, the inference speed is still an issue for me. I measured that running the three CLOCs heads takes ~3-4 ms and building the three input tensors (one for each class) also takes 3-4 ms, but the processing functions before them take some time as well, so in total the added delay is around 20 ms.
Could you take a quick look at my code and perhaps spot some ways it could be further optimized?
This is my fusion_utils.py script
And this is my file for the detector, fusion_layer.py script:
I didn’t modify the function build_stage2_training() that builds the input tensor, since it runs faster in numba than in PyTorch. As I said before, this takes 3-4 ms in total for the three classes, which is fine, but the preceding processing steps take longer. I would appreciate any feedback from your side, thank you in advance!