Computation of Intersection over Union is erroneous

🐛 Bug

The intersection over union (IoU) computed during NMS overestimates the actual IoU, especially for small bounding boxes. This holds for both the CPU and CUDA implementations; I will use the CUDA code to explain: https://github.com/facebookresearch/maskrcnn-benchmark/blob/24c8c90efdb7cc51381af5ce0205b23567c3cd21/maskrcnn_benchmark/csrc/cuda/nms.cu#L13

Given two boxes:

boxesA = [0, 0, 10, 10]
boxesB = [1, 1, 11, 11]

The area of each box should be 100. The intersection of the two boxes is the region [1, 1, 10, 10], which has an area of 81. The IoU should therefore be: 81 / (100 + 100 - 81) = 81 / 119 = 0.68067.

However, using your IoU implementation, the NMS module computes an IoU of 0.704225 for the given boxes. This is due to an oversizing that you apply to both the intersection area and the box areas: https://github.com/facebookresearch/maskrcnn-benchmark/blob/24c8c90efdb7cc51381af5ce0205b23567c3cd21/maskrcnn_benchmark/csrc/cuda/nms.cu#L16-L19. By adding + 1 to every width and height, you overestimate the size of the respective boxes. The area of each of boxesA and boxesB from above becomes 11 * 11 = 121, and the intersection area becomes 10 * 10 = 100. As you can see, this already changes the ratio of overlap to box size from 81/100 = 0.81 to 100/121 = 0.82644. The overestimation then propagates into the IoU, 100 / (121 + 121 - 100) = 100 / 142, which gives the above-mentioned 0.704225.
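
The arithmetic above can be checked with a minimal pure-Python sketch. The function `box_iou` and its `offset` parameter are hypothetical names introduced here for illustration; they are not part of maskrcnn-benchmark. Setting `offset=1` is assumed to mirror the "+ 1" behavior described in this report, while `offset=0` gives the analytical value.

```python
def box_iou(box_a, box_b, offset=0.0):
    """IoU of two [x1, y1, x2, y2] boxes (illustrative helper, not the repo's code).

    offset=0 treats the coordinates as continuous corners;
    offset=1 adds 1 to every width and height, as described in the report.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(min(ax2, bx2) - max(ax1, bx1) + offset, 0.0)
    inter_h = max(min(ay2, by2) - max(ay1, by1) + offset, 0.0)
    inter = inter_w * inter_h

    area_a = (ax2 - ax1 + offset) * (ay2 - ay1 + offset)
    area_b = (bx2 - bx1 + offset) * (by2 - by1 + offset)
    return inter / (area_a + area_b - inter)


boxes_a = [0, 0, 10, 10]
boxes_b = [1, 1, 11, 11]

print(f"{box_iou(boxes_a, boxes_b, offset=0):.6f}")  # 81 / 119
print(f"{box_iou(boxes_a, boxes_b, offset=1):.6f}")  # 100 / 142
```

With the 0.7 threshold used in the tests in this report, the first value falls below the threshold and the second one above it.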

The smaller the boxes are, the greater the influence of this overestimation. If you scale both boxes up by a factor of 100, the influence of the oversizing is reduced and the computed IoU of 0.680926 comes closer to the analytical IoU of 0.68067.
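
To make the scale dependence concrete, here is a small sketch that sweeps a few scale factors. The helper `iou_plus_one` is a hypothetical pure-Python stand-in that assumes the "+ 1" is added to every width and height; it is not the repository's code.

```python
def iou_plus_one(box_a, box_b):
    """IoU with 1 added to every width and height (illustrative stand-in)."""
    inter_w = max(min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]) + 1, 0)
    inter_h = max(min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]) + 1, 0)
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0] + 1) * (box_a[3] - box_a[1] + 1)
    area_b = (box_b[2] - box_b[0] + 1) * (box_b[3] - box_b[1] + 1)
    return inter / (area_a + area_b - inter)


analytical = 81 / 119  # scale-invariant reference value

results = {}
for scale in (1, 10, 100, 1000):
    box_a = [scale * v for v in (0, 0, 10, 10)]
    box_b = [scale * v for v in (1, 1, 11, 11)]
    results[scale] = iou_plus_one(box_a, box_b)
    error = results[scale] - analytical
    print(f"scale={scale:>4}  iou={results[scale]:.6f}  error={error:+.6f}")
```

The error shrinks as the scale grows but stays positive, so a threshold close to the analytical IoU can flip the NMS decision depending only on the scale of the boxes.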

To Reproduce

Steps to reproduce the behavior:

Add the following test case to https://github.com/facebookresearch/maskrcnn-benchmark/blob/master/tests/test_nms.py. It creates both boxes and calls the NMS procedure. Both boxes should be kept, as the IoU is below 0.7. However, one of the boxes is removed, because the IoU computed by your IoU function is 0.704…


    def test_nms2_cpu(self):
        """ Two 10x10 boxes offset by one pixel in x and y.
            The analytical IoU is 81 / 119, roughly 0.68, which is below the 0.7 threshold.
        """

        boxes = torch.from_numpy(
            np.array(
                [
                    [0, 0, 10, 10],
                    [1, 1, 11, 11],
                ]
            ).astype(np.float32)
        )
        scores = torch.from_numpy(
            np.array(
                [
                    0.9,
                    0.9,
                ]
            ).astype(np.float32)
        )

        keep_indices = box_nms(boxes, scores, 0.7)
        keep_indices = np.sort(keep_indices)

        # Both boxes should be kept because the analytical IoU is about 0.68,
        # but this assertion fails: the computed IoU is 0.704225, above 0.7.
        np.testing.assert_array_equal(keep_indices, np.array([0, 1]))

This second test case shows that once we scale the boxes up by a factor of 100, the IoU becomes less erroneous and the behavior is as expected: both boxes are kept, as the computed IoU is below the threshold. It further shows that your implementation is not scale-invariant.

    def test_nms_working_cpu(self):
        """ The same two boxes as in test_nms2_cpu, scaled by a factor of 100 before NMS.
        """

        boxes = torch.from_numpy(
            np.array(
                [
                    [0, 0, 10, 10],
                    [1, 1, 11, 11],
                ]
            ).astype(np.float32)
        )
        scores = torch.from_numpy(
            np.array(
                [
                    0.9,
                    0.9,
                ]
            ).astype(np.float32)
        )

        # Same boxes and scores as in the first test; now scale the boxes up.
        keep_indices = box_nms(boxes * 100, scores, 0.7)
        keep_indices = np.sort(keep_indices)
        # All boxes should be kept as the original IoU will be 0.68
        # This test case will pass as the boxes have been scaled by the factor of 100
        np.testing.assert_array_equal(keep_indices, np.array([0, 1]))

Expected behavior

The IoU should be computed correctly, and no suppression should take place, as the IoU is below the threshold of 0.7.
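
As a rough illustration of the expected versus observed outcome, the following is a simplified greedy NMS in pure Python. The names `iou`, `greedy_nms`, and the `offset` parameter are hypothetical and only approximate the behavior described in this report; this is not the `box_nms` implementation from the repository.

```python
def iou(box_a, box_b, offset):
    """IoU of two [x1, y1, x2, y2] boxes; offset is added to each width and height."""
    inter_w = max(min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]) + offset, 0)
    inter_h = max(min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]) + offset, 0)
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0] + offset) * (box_a[3] - box_a[1] + offset)
    area_b = (box_b[2] - box_b[0] + offset) * (box_b[3] - box_b[1] + offset)
    return inter / (area_a + area_b - inter)


def greedy_nms(boxes, scores, threshold, offset):
    """Return indices of kept boxes, visiting boxes from highest to lowest score."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # A box is suppressed if its IoU with any kept box exceeds the threshold.
        if all(iou(boxes[i], boxes[k], offset) <= threshold for k in keep):
            keep.append(i)
    return keep


boxes = [[0, 0, 10, 10], [1, 1, 11, 11]]
scores = [0.9, 0.9]

print(greedy_nms(boxes, scores, 0.7, offset=0))  # both boxes kept
print(greedy_nms(boxes, scores, 0.7, offset=1))  # second box suppressed
```

Under these assumptions, the `offset=0` run matches the expected behavior and the `offset=1` run matches the failing test.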

Environment

PyTorch version: 1.2.0.dev20190528
Is debug build: No
CUDA used to build PyTorch: 10.0.130

OS: Ubuntu 16.04.6 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
CMake version: version 3.5.1

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 10.0.130
GPU models and configuration: GPU 0: GeForce GTX TITAN X
Nvidia driver version: 418.39
cuDNN version: Could not collect

Versions of relevant libraries:
[pip] numpy==1.16.3
[pip] torch==1.2.0.dev20190528
[pip] torchvision==0.2.2
[conda] blas 1.0 mkl
[conda] mkl 2019.3 199
[conda] mkl_fft 1.0.12 py36ha843d7b_0
[conda] mkl_random 1.0.2 py36hd81dba3_0
[conda] pytorch 1.0.1 py3.6_cuda10.0.130_cudnn7.4.2_2 pytorch
[conda] pytorch-nightly 1.2.0.dev20190528 py3.6_cuda10.0.130_cudnn7.5.1_0 pytorch
[conda] torchvision 0.2.2 py_3 pytorch

Top GitHub Comments

bernhardschaefer commented, Aug 9, 2019

I think this is a matter of convention, and from my understanding, in this repo the xmax and ymax of the boxes are inclusive.

To stick with your example:

boxesA = [0, 0, 10, 10]
boxesB = [1, 1, 11, 11]

For each of x and y, the box boxesA covers the pixels [0,1,...,10]. This means the width and height of boxesA are 11 and not 10, and the area becomes 121.
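
The inclusive convention described in this comment can be illustrated by counting pixels directly. This is only a sketch; `covered_pixels` is a hypothetical helper and assumes integer coordinates with inclusive xmax and ymax.

```python
def covered_pixels(box):
    """Set of integer (x, y) pixels covered when xmax and ymax are inclusive."""
    x1, y1, x2, y2 = box
    return {(x, y) for x in range(x1, x2 + 1) for y in range(y1, y2 + 1)}


pixels_a = covered_pixels([0, 0, 10, 10])
pixels_b = covered_pixels([1, 1, 11, 11])

area_a = len(pixels_a)            # 11 * 11 pixels
inter = len(pixels_a & pixels_b)  # pixels covered by both boxes
union = len(pixels_a | pixels_b)  # pixels covered by at least one box

print(area_a, inter, union, round(inter / union, 6))  # 121 100 142 0.704225
```

Read this way, the value 0.704225 is the exact IoU of the two pixel sets, which is the sense in which the result depends on the coordinate convention.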

harshgrovr commented, Aug 14, 2019

@rnsandeep can you share your Environment configuration please? I am having version issues.