COCO AP of FPN with ResNet-50 backbone for object detection
Hi @fmassa, thanks for the great code.
I am confused about the COCO AP of Faster R-CNN ResNet-50 FPN. From the documentation, #925, and the source code, I guess that this model was trained with the following hyperparameters and reached a box AP of 37.0. Am I right?
Repo | Network | box AP | scheduler | epochs | lr-steps | batch size | lr |
---|---|---|---|---|---|---|---|
vision | R-50 FPN | 37.0 | 2x | 26 | 16, 22 | 16 | 0.02 |
batch_size = 2 * 8 (NUM_GPU) = 16
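For reference, here is a minimal sketch of the schedule in the table above. It assumes a plain step decay in which the learning rate is multiplied by `gamma = 0.1` at each lr-step epoch. That factor and the helper names `effective_batch_size` and `lr_at_epoch` are assumptions made for illustration and are not confirmed in this thread.

```python
def effective_batch_size(images_per_gpu: int, num_gpus: int) -> int:
    """Total images per optimizer step across all GPUs."""
    return images_per_gpu * num_gpus


def lr_at_epoch(epoch: int, base_lr: float = 0.02,
                lr_steps: tuple = (16, 22), gamma: float = 0.1) -> float:
    """Learning rate used during `epoch` (0-indexed) under step decay.

    gamma=0.1 is an assumed decay factor, not a value taken from the issue.
    """
    # Count how many decay milestones have already been reached.
    drops = sum(1 for step in lr_steps if epoch >= step)
    return base_lr * gamma ** drops


if __name__ == "__main__":
    print("batch size:", effective_batch_size(images_per_gpu=2, num_gpus=8))
    for epoch in (0, 15, 16, 21, 22, 25):
        print(f"epoch {epoch:2d}: lr = {lr_at_epoch(epoch):.5f}")
```

Under these assumptions, the 26-epoch run would use 0.02 for epochs 0 to 15, 0.002 for epochs 16 to 21, and 0.0002 for the remaining four epochs.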
However, I noticed that maskrcnn-benchmark and Detectron seem to report better box AP for a comparable amount of training, as shown below:
Repo | Network | box AP | scheduler | epochs | lr-steps | batch size | lr |
---|---|---|---|---|---|---|---|
maskrcnn-benchmark | R-50 FPN | 36.8 | 1x | 12.28 | 8.19, 10.92 | 16 | 0.02 |
Detectron | R-50 FPN | 36.7 | 1x | 12.28 | 8.19, 10.92 | 16 | 0.02 |
Detectron | R-50 FPN | 37.9 | 2x | 24.56 | 16.37, 21.83 | 16 | 0.02 |
The epoch counts come from the maskrcnn-benchmark 1x config: epochs = 90000 (steps) * 16 (batch size) / 117266 (training images per epoch) = 12.28.
By the way, COCO2017 has 118287 training images, but only 117266 of them contain at least one object.
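The iteration-to-epoch conversion can be checked with a short sketch. The helper `iterations_to_epochs` is hypothetical. The lr-step iterations (60000 and 80000 for 1x, 120000 and 160000 for 2x) and the 180000 total for 2x are assumed from the usual 1x/2x configs rather than stated in this issue, although they do reproduce the epoch values in the table.

```python
IMAGES_PER_EPOCH = 117266  # COCO2017 train images with at least one object
BATCH_SIZE = 16


def iterations_to_epochs(iterations: int,
                         batch_size: int = BATCH_SIZE,
                         images_per_epoch: int = IMAGES_PER_EPOCH) -> float:
    """Convert an iteration count into (fractional) epochs."""
    return iterations * batch_size / images_per_epoch


# Assumed iteration-based schedules: (lr-step iterations, total iterations).
SCHEDULES = {
    "1x": ((60000, 80000), 90000),
    "2x": ((120000, 160000), 180000),
}

if __name__ == "__main__":
    for name, (lr_steps, total) in SCHEDULES.items():
        steps = ", ".join(f"{iterations_to_epochs(s):.2f}" for s in lr_steps)
        print(f"{name}: epochs = {iterations_to_epochs(total):.2f}, "
              f"lr-steps = {steps}")
```

Working in epochs rather than iterations makes the torchvision schedule (26 epochs, steps at 16 and 22) directly comparable with the iteration-based Detectron and maskrcnn-benchmark schedules.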
I would like to know what causes these gaps:
- 37.0 (torchvision 2x) vs 36.8 (maskrcnn-benchmark 1x)
- 37.0 (torchvision 2x) vs 37.9 (Detectron 2x)
Also, could you share the result for a model trained with the 1x schedule?
Repo | Network | box AP | scheduler | epochs | lr-steps | batch size | lr |
---|---|---|---|---|---|---|---|
vision | R-50 FPN | ?? | 1x | 13 | 8, 11 | 16 | 0.02 |
Thank you!
Issue Analytics
- State:
- Created 3 years ago
- Comments:7 (5 by maintainers)
Sure, I’ve sent PR #2113 for this.
Hi,
There are a few differences between both implementations that lead to this difference in mAP:
Together, these add up to the discrepancy that you see. Given the complexity of Faster R-CNN as a model, every small detail can slightly change the training dynamics while still producing comparable models in the end (after more epochs), so for the sake of uniformity and simplicity we decided to make this compromise.
IIRC, training on the 1x schedule gives ~36.3 mAP, but I can’t find the logs anymore and would need to re-train the model to be sure.
Let me know if you have more questions!