Questions on reproducing the reported results on MS COCO

See original GitHub issue

Hi,

First, thank you for sharing the exciting work.

I was trying to reproduce the results on the MS COCO dataset with my own training framework. For the baseline I used plain cross-entropy loss, loss_function = AsymmetricLoss(gamma_neg=0, gamma_pos=0, clip=0), and reached an mAP of ~82.5% with a ResNet101 backbone, which is quite close to the result reported in Fig. 8 of the paper.

Then, I replaced the loss function with loss_function = AsymmetricLoss(gamma_neg=4, gamma_pos=1, clip=0.05), keeping all other hyperparameters the same. However, I only got an mAP of ~82.1%.

Also, the traditional focal loss, loss_function = AsymmetricLoss(gamma_neg=2, gamma_pos=2, clip=0), cannot outperform the baseline (~82.5%) under the same configuration, so I am wondering what is going wrong in my training process.
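
For reference, the three configurations I am comparing look roughly like this (assuming the AsymmetricLoss class from the official repository; the exact import path may differ in my framework):

    # The three loss configurations compared above; assumes the AsymmetricLoss
    # implementation from the official ASL repository (import path may differ).
    from src.loss_functions.losses import AsymmetricLoss

    ce_loss    = AsymmetricLoss(gamma_neg=0, gamma_pos=0, clip=0)     # baseline: plain (binary) cross entropy
    asl_loss   = AsymmetricLoss(gamma_neg=4, gamma_pos=1, clip=0.05)  # asymmetric loss from the paper
    focal_loss = AsymmetricLoss(gamma_neg=2, gamma_pos=2, clip=0)     # symmetric focal loss

    # usage in the training loop: logits are raw model outputs,
    # targets are multi-hot label vectors of shape (batch_size, num_classes)
    # loss = asl_loss(logits, targets)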

Could you also please share some training tricks? For example, a code snippet for adjusting the learning rate, training transforms similar to the validation transforms used here, etc. Or do you have any other suggestions?

Thank you.

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 3
  • Comments: 12 (4 by maintainers)

Top GitHub Comments

9 reactions
mrT23 commented on Jan 28, 2021

We honestly haven't encountered any case where ASL did not easily outperform cross entropy.

Here are some training tricks we used (they are quite standard and can also be found in public repositories like this one); see if anything differs from your framework:

  • for the learning rate, we use a one-cycle policy (warmup + cosine decay) with the Adam optimizer and a max learning rate of ~2e-4 to 4e-4 (see the optimizer/scheduler sketch after this list)
  • it is very important to also use EMA of the model weights (a minimal sketch follows below)
  • true weight decay of 1e-4 (“true” == no weight decay for batch norm and bias; also covered in the optimizer sketch below)
  • we have our own augmentation package, but it is important to use at least standard AutoAugment
  • cutout of 0.5 (very important)
  • squish resizing, not cropping (important); a transform sketch approximating these augmentation points follows below
  • try replacing ResNet with TResNet; it will give you the same GPU speed with higher accuracy (see the timm snippet below)
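
A minimal PyTorch sketch of the learning-rate and weight-decay points above (model, train_loader and num_epochs are placeholders from your own training loop; this is not our exact code):

    import torch

    def param_groups_true_wd(model, weight_decay=1e-4):
        """Apply weight decay to weights only, not to biases or (batch-)norm parameters."""
        decay, no_decay = [], []
        for name, param in model.named_parameters():
            if not param.requires_grad:
                continue
            # 1-D parameters are biases and norm scales/offsets -> no weight decay
            if param.ndim <= 1 or name.endswith(".bias"):
                no_decay.append(param)
            else:
                decay.append(param)
        return [
            {"params": decay, "weight_decay": weight_decay},
            {"params": no_decay, "weight_decay": 0.0},
        ]

    optimizer = torch.optim.Adam(param_groups_true_wd(model, 1e-4), lr=2e-4)

    # one-cycle policy: warmup followed by cosine decay from max_lr
    scheduler = torch.optim.lr_scheduler.OneCycleLR(
        optimizer,
        max_lr=2e-4,                        # ~2e-4 to 4e-4 works well
        steps_per_epoch=len(train_loader),
        epochs=num_epochs,
        pct_start=0.2,                      # fraction of the schedule used for warmup
        anneal_strategy="cos",
    )
    # call scheduler.step() after every optimizer.step()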
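
For the EMA point, a simple sketch of keeping an exponential moving average of the weights (the decay value here is an assumption; run validation and save checkpoints with the EMA copy rather than the raw model):

    import copy
    import torch

    class ModelEma:
        """Keep an exponential moving average of a model's parameters."""
        def __init__(self, model, decay=0.9997):
            self.ema = copy.deepcopy(model).eval()
            self.decay = decay
            for p in self.ema.parameters():
                p.requires_grad_(False)

        @torch.no_grad()
        def update(self, model):
            ema_params = dict(self.ema.named_parameters())
            for name, param in model.named_parameters():
                ema_params[name].mul_(self.decay).add_(param.detach(), alpha=1.0 - self.decay)
            # keep buffers (e.g. BatchNorm running statistics) in sync
            ema_buffers = dict(self.ema.named_buffers())
            for name, buf in model.named_buffers():
                ema_buffers[name].copy_(buf)

    # usage: ema = ModelEma(model); call ema.update(model) after each optimizer step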
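
And a rough torchvision approximation of the augmentation points (AutoAugment, cutout, squish resizing). RandomErasing stands in for cutout here, and 448 is the input resolution typically used for COCO multi-label training; our internal augmentation package differs in the details:

    import torchvision.transforms as T

    image_size = 448

    train_transform = T.Compose([
        # "squish": resize both sides to the target resolution instead of crop-resizing
        T.Resize((image_size, image_size)),
        T.AutoAugment(),  # standard AutoAugment (default ImageNet policy)
        T.ToTensor(),
        # cutout-style occlusion: erase one random patch of up to ~50% of the image area
        T.RandomErasing(p=1.0, scale=(0.02, 0.5), value=0),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])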
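
For the last point, TResNet variants are available in the timm package (the model name, pretrained flag and num_classes below are just an illustration; pick the variant that fits your GPU budget):

    import timm

    # 80 classes for MS COCO multi-label classification
    model = timm.create_model("tresnet_l", pretrained=True, num_classes=80)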

That's what I can think of off the top of my head.

5 reactions
mrT23 commented on Jan 20, 2021

I agree.

We cannot share our training code as-is due to commercial limitations, but once public code is shared, we can try to help improve it and reach results similar to the ones in the article.
