I have some questions. Could you please help me?

See original GitHub issue

Hi @mnslarcher, thanks for your excellent repo!

I ran kmeans_anchors_ratios.py and got the following results.

[12/08 09:51:50] Starting the calculation of the optimal anchors ratios
[12/08 09:51:50] Extracting and preprocessing bounding boxes
[12/08 09:51:50] Discarding 271 bounding boxes with size lower or equal to 0
[12/08 09:51:50] K-Means (1 run):   0%|                           | 0/1 [00:00<?, ?it/s]	Runs avg. IoU: 67.94% ± 0.00% (mean ± std. dev. of 1 runs, 0 skipped)
[12/08 09:51:50] K-Means (1 run): 100%|███████████████████| 1/1 [00:00<00:00,  3.30it/s]
	Avg. IoU between bboxes and their most similar anchors after norm. them to make their area equal (only ratios matter): 67.94%
[12/08 09:51:51] Default anchors ratios: [(0.7, 1.4), (1.0, 1.0), (1.4, 0.7)]
	Avg. IoU between bboxes and their most similar default anchors, no norm. (both ratios and sizes matter): 9.64%
	Num. bboxes without similar default anchors (IoU < 0.5):  22514/23553 (95.59%)
[12/08 09:51:51] K-Means anchors ratios: [(0.4, 2.5), (0.8, 1.3), (1.6, 0.6)]
	Avg. IoU between bboxes and their most similar K-Means anchors, no norm. (both ratios and sizes matter): 9.55%
	Num. bboxes without similar K-Means anchors (IoU < 0.5):  22539/23553 (95.69%)
[12/08 09:51:51] Default anchors have an IoU < 50% with bboxes in 0.11% less cases than the K-Means anchors, you should consider stick with them

Process finished with exit code 0

Here are my questions:

1. From reading your code: even though the script tells me I should consider sticking with the default anchors, the line "Num. bboxes without similar K-Means anchors (IoU < 0.5): 22539/23553 (95.69%)" means that almost none of the bounding boxes have a matching anchor. So I think these anchor ratios are very poor. Am I right?
2. If I am right, how can I solve this problem and get better anchor ratios?
3. By the way, what is the meaning and role of the anchor size?

Thanks for your answer!

Issue Analytics

Created 3 years ago
Comments: 12 (6 by maintainers)

Top GitHub Comments

1 reaction

HsLOL commented, Dec 12, 2020


Thank you for your advice. I will have a try.

0 reactions

mnslarcher commented, Dec 11, 2020

@HsLOL

You are more interested in the quartiles than in the mean.

From the summary above you now know that 25% of your bounding boxes have a size < 5, which is very small (4px per side if your boxes are square).
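
To make the quartile point concrete, here is a minimal sketch of computing them with Python's standard library. The boxes are made-up (width, height) pairs, not data from this issue, and `box_size` assumes that "size" means the square root of the box area. That matches the "4px per side if your boxes are square" reading, but it is an assumption rather than something the script output confirms.

```python
import math
import statistics

# Hypothetical (width, height) pairs in pixels, for illustration only.
boxes = [(4, 4), (3, 5), (6, 6), (10, 8), (40, 30), (120, 90), (5, 4), (8, 8)]


def box_size(width, height):
    """Side of the square with the same area (assumed definition of size)."""
    return math.sqrt(width * height)


sizes = sorted(box_size(w, h) for w, h in boxes)

mean_size = statistics.mean(sizes)
# quantiles(n=4) returns the three cut points: Q1, median, Q3.
q1, median, q3 = statistics.quantiles(sizes, n=4)

print(f"mean={mean_size:.1f}  Q1={q1:.1f}  median={median:.1f}  Q3={q3:.1f}")
```

In this toy data a couple of large boxes pull the mean well above the median, so the mean alone would hide how many boxes are tiny.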

This means that you would like to have anchors with a size of around 4. This can be achieved by changing the anchor scale in a yml file like this, as explained here
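
Why a size of around 4 matters can be checked with a small calculation. The sketch below computes the IoU of two boxes that share the same center, so only widths and heights matter. That alignment is an assumption made for illustration, and `centered_iou` is a hypothetical helper, not a function from the repo. The 32x32 anchor is the smallest size that the ANCHORS_SIZES expression in this thread yields for PHI = 0.

```python
def centered_iou(box, anchor):
    """IoU of two (width, height) boxes that share the same center."""
    bw, bh = box
    aw, ah = anchor
    intersection = min(bw, aw) * min(bh, ah)
    union = bw * bh + aw * ah - intersection
    return intersection / union


# A 4x4 box against a 32x32 anchor: 16 / 1024, far below the 0.5 threshold.
print(centered_iou((4, 4), (32, 32)))
# The same box against a 5x5 anchor: 16 / 25, above the threshold.
print(centered_iou((4, 4), (5, 5)))
```

For a square box of side s inside a square anchor of side a, the IoU is s²/a², so it only reaches 0.5 when a is at most s·√2. For s = 4 that means an anchor no larger than about 5.7 px.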

You can test the result of this by changing this part of the tutorial in my repo:

## Change the following parameters according to your model:

import numpy as np

# EfficientDetD{PHI}
PHI = 0  # for another EfficientDet change only this, e.g. PHI = 3 for D3

# Per-model settings, indexed by PHI
input_sizes = [512, 640, 768, 896, 1024, 1280, 1280, 1536, 1536]
pyramid_levels = [5, 5, 5, 5, 5, 5, 5, 5, 6]
anchor_scale = [4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 4.0, 5.0, 4.0]

scale = anchor_scale[PHI]
strides = 2 ** np.arange(3, pyramid_levels[PHI] + 3)  # one stride per pyramid level
scales = np.array([2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)])  # three scales per level

INPUT_SIZE = input_sizes[PHI]
# One anchor size per (stride, scale) pair
ANCHORS_SIZES = (scale * scales * strides[:, np.newaxis]).flatten().tolist()
ANCHORS_SIZES

Replace scales = np.array([2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)]) with scales = np.array([0.25 ** 0, 0.25 ** (1.0 / 3.0), 0.25 ** (2.0 / 3.0)])
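
To see what this replacement does to the sizes, the following sketch recomputes the list for PHI = 0 with both sets of scales. It uses plain Python instead of NumPy so that it runs without extra dependencies. `anchor_sizes` is a hypothetical helper that mirrors the `scale * scales * strides` expression; it is not code from the repo.

```python
def anchor_sizes(anchor_scale, octave_scales, strides):
    """One size per (stride, scale) pair, in the same order as the NumPy version."""
    return [anchor_scale * s * stride for stride in strides for s in octave_scales]


# Values for PHI = 0, taken from the snippet in this thread.
anchor_scale = 4.0
strides = [2 ** level for level in range(3, 5 + 3)]  # [8, 16, 32, 64, 128]

default_scales = [2 ** 0, 2 ** (1.0 / 3.0), 2 ** (2.0 / 3.0)]
smaller_scales = [0.25 ** 0, 0.25 ** (1.0 / 3.0), 0.25 ** (2.0 / 3.0)]

default_sizes = anchor_sizes(anchor_scale, default_scales, strides)
smaller_sizes = anchor_sizes(anchor_scale, smaller_scales, strides)

print("default min/max:", min(default_sizes), max(default_sizes))
print("smaller min/max:", min(smaller_sizes), max(smaller_sizes))
```

With these inputs the smallest default size is 32.0 and the smallest size after the replacement is about 12.7. The change moves the anchors toward small objects but, on its own, does not reach a size of around 4, which is presumably where the anchor scale setting comes in.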

Now, if possible (i.e. if your GPU has enough memory), it is better to use a larger input size (maybe by selecting a larger EfficientDet), because 4x4 objects are very difficult to detect.

If you can’t use the full resolution, another strategy is to divide every image into several parts; this requires some work on the annotations. Especially during inference, you probably want some overlap between the parts so that you can correctly detect a bounding box that would otherwise be split in two.
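
The splitting strategy can be sketched as follows. This is only an illustration under assumed values: `tile_starts` and `tile_boxes` are hypothetical helpers, and the 512 px tile with 64 px overlap is an arbitrary choice, not a recommendation from the repo.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of tiles covering [0, length); neighbours share at least `overlap` pixels."""
    if tile >= length:
        return [0]
    stride = tile - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than the tile size")
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile is flush with the far edge
    return starts


def tile_boxes(width, height, tile, overlap):
    """(x0, y0, x1, y1) crop windows that together cover the whole image."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_starts(height, tile, overlap)
        for x in tile_starts(width, tile, overlap)
    ]


for window in tile_boxes(1024, 768, 512, 64):
    print(window)
```

For training, the annotations of each crop would need to be shifted by the window's (x0, y0) and clipped to the window. At inference, detections are shifted back by the same offset, and duplicates in the overlapping regions can be merged, for example with non-maximum suppression.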