Is there a way to use images with no bounding boxes to train a FasterRCNN?
Hi, I am currently using version 1.7.1 of PyTorch with the corresponding version of torchvision. Everything works perfectly when training a FasterRCNN FPN 50 using your reference training project. That is, until I added images with no bounding boxes to the training data, at which point the whole thing broke down.
I need background images with no bounding boxes because the model needs to learn that certain types of images will not have objects in them, and if I only feed it images with objects in them, it will never see these types of images and predict nonsense when it gets them during inference.
Is this feature supported? The error I currently get when trying to train is:
/home/joaqo/.local/lib/python3.6/site-packages/torchvision/models/detection/generalized_rcnn.py, line 69, in forward
    boxes.shape))
ValueError: Expected target boxes to be a tensor of shape [N, 4], got torch.Size([0]).
This makes perfect sense if you want to force your model to train only on images with targets; regrettably, that's not my use case. If this is not currently supported, would anyone provide some pointers so I can add support for it?
Thanks.
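For reference, the check that raises this error (in generalized_rcnn.py in the torchvision versions of that era) only validates the shape of each target's boxes tensor: it must be 2-D with a last dimension of 4, even when it contains zero boxes. A small sketch of the shapes involved, with purely illustrative values:

```python
import torch

# Shapes the [N, 4] check accepts vs. the one it rejects (illustrative values only).
boxes_one_object = torch.tensor([[10.0, 20.0, 50.0, 80.0]])  # shape [1, 4] -> accepted
boxes_empty_2d   = torch.zeros((0, 4), dtype=torch.float32)  # shape [0, 4] -> accepted
boxes_empty_1d   = torch.tensor([])                          # shape [0]    -> triggers the ValueError above

for b in (boxes_one_object, boxes_empty_2d, boxes_empty_1d):
    ok = b.ndim == 2 and b.shape[-1] == 4
    print(tuple(b.shape), "accepted" if ok else "rejected")
```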
Top GitHub Comments
Hi @joaqo !
Yes, it is possible to use images with no bounding boxes in the training data with torchvision. This feature was implemented in release 0.6. Here is a note on how to use it.
If you have a background image with no bounding boxes at all, the boxes are assumed to be [0, 0, 0, 0], hence the area is 0. The label is also assumed to be the background label, which is 0 (predefined by the torchvision models). For example, if such an image has image_id = 3, you need to pass a target for it that follows these conventions (see the sketch below). I think the reference script will work out of the box if you make these changes to your dataset.
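The exact snippet from the original comment is not reproduced above; the following is a minimal sketch of what a dataset's __getitem__ could return for such a background image. Note that the torchvision releases discussed here (0.6 and later, including the 0.8.x used in the question) also accept an empty boxes tensor of shape [0, 4] for negative samples, and that is the form used below; the field names follow the standard torchvision detection target format, and the image_id of 3 is just the example value from above.

```python
import torch

def make_negative_target(image_id):
    # Target for an image with no objects: empty boxes/labels tensors.
    # (Assumed convention for torchvision >= 0.6; field names follow the
    # detection target format used by the torchvision reference scripts.)
    return {
        "boxes": torch.zeros((0, 4), dtype=torch.float32),  # shape [0, 4], passes the [N, 4] check
        "labels": torch.zeros((0,), dtype=torch.int64),     # no labels for a background image
        "image_id": torch.tensor([image_id]),                # e.g. 3, as in the example above
        "area": torch.zeros((0,), dtype=torch.float32),      # area of each box; none here
        "iscrowd": torch.zeros((0,), dtype=torch.int64),     # used by the COCO-style evaluation
    }

# In a dataset's __getitem__, a background image would return (image, make_negative_target(idx)).
```

With a target like this, the shape check quoted in the question passes, since [0, 4] is still a valid [N, 4] tensor.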
Feel free to post problems you face in this issue.
Internally, FasterRCNN (and many other object detection models) divides the image into many, sometimes overlapping, areas called anchors. For each of these anchors it predicts whether it belongs to one of the foreground classes or to the background class. During training, the model samples a bunch of these anchors in different strategic ways, tries to keep a reasonable proportion between foreground and background classes (look here and here for the proportions used in FasterRCNN), and computes the loss function on them.
So the model is learning not only how well it did on the anchors it predicted as foreground, but also on the ones it predicted as background. Seen this way, training on an image with no annotations is not really a problem: it is as if the foreground/background proportion were set to all background for that one sample. You cannot do it for every sample, but for a subset of them it's fine.
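As a concrete pointer, the foreground/background sampling described above is controlled by constructor arguments of torchvision's FasterRCNN. Here is a sketch of where those proportions can be adjusted, using the torchvision 0.8-era API from the question; the values shown are, as far as I know, the library defaults:

```python
import torchvision

# The RPN and the box head each sample a fixed number of anchors/proposals per image
# and aim for a given positive (foreground) fraction; for a negative image, all of the
# sampled anchors simply end up labelled as background for that sample.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    pretrained=False,
    num_classes=2,                  # 1 foreground class + background (illustrative)
    rpn_batch_size_per_image=256,   # anchors sampled per image for the RPN loss
    rpn_positive_fraction=0.5,      # target fraction of foreground anchors in that sample
    box_batch_size_per_image=512,   # proposals sampled per image for the box head loss
    box_positive_fraction=0.25,     # target fraction of foreground proposals
)
```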