The prototype `FiveCrop`/`TenCrop`/`BatchMultiCrop` produce unexpected results

🐛 Describe the bug

The current implementation of these transforms produces unexpected results for Classification.

If a single image comes in (not batched and collated), a batched result comes out with a single label.
If a batch of images comes in, we get a new image batch with 5x the original size and a label vector with length equal to the original size.

In both cases, the X and the Y lengths don’t match. This creates issues on validation pipelines.
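To make the mismatch concrete, here is a minimal, library-free sketch. It does not use torchvision: `five_crop` and `multi_crop_batch` are hypothetical toy helpers working on nested lists, and the center-crop offset uses plain floor division, which is an assumption and may differ from the real transform.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]  # toy H x W image stored as nested lists


def five_crop(img: Image, size: int) -> List[Image]:
    """Return the four corner crops and the center crop, each size x size."""
    h, w = len(img), len(img[0])
    if size > h or size > w:
        raise ValueError("crop size exceeds image size")

    def crop(top: int, left: int) -> Image:
        return [row[left:left + size] for row in img[top:top + size]]

    return [
        crop(0, 0),                              # top left
        crop(0, w - size),                       # top right
        crop(h - size, 0),                       # bottom left
        crop(h - size, w - size),                # bottom right
        crop((h - size) // 2, (w - size) // 2),  # center
    ]


def multi_crop_batch(
    images: Sequence[Image], labels: Sequence[int], size: int
) -> Tuple[List[Image], List[int]]:
    """Crop every image five times and repeat its label once per crop."""
    out_images: List[Image] = []
    out_labels: List[int] = []
    for img, label in zip(images, labels):
        crops = five_crop(img, size)
        out_images.extend(crops)
        out_labels.extend([label] * len(crops))
    return out_images, out_labels


# A batch of two 4x4 images with one label each.
images = [[[r * 4 + c for c in range(4)] for r in range(4)] for _ in range(2)]
labels = [7, 3]

# Cropping the images alone leaves the labels behind.
naive_images = [c for img in images for c in five_crop(img, 2)]
print(len(naive_images), len(labels))  # prints "10 2"

# Repeating each label per crop restores the alignment.
fixed_images, fixed_labels = multi_crop_batch(images, labels, 2)
print(len(fixed_images), len(fixed_labels))  # prints "10 10"
```

The first print shows the situation described in the report (5x the images, the original number of labels); the second shows what a consistent output would look like.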

One approach would be to repeat the Labels once per crop so that the lengths match. It’s quite likely the same approach should be considered for meta-data (like ids) and other content included in the record. Things get even more complex if we consider extending the transform to Detection, as we would need to crop the Masks/BBoxes, clean up the BBoxes and sync the Labels for each of the new crops.

This complexity is possibly the reason why the existing FiveCrop implementation on stable doesn’t offer a way to stack the result for the user, but instead just provides an example in the documentation of how it can be done. Due to the above, I believe that the current implementation of BatchMultiCrop is flawed and should be either removed or redeveloped.
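For context, an alternative to repeating labels is to keep one label per image and fold the crops back together at evaluation time by averaging the per-crop predictions. As far as I recall, the documentation example for stable takes this route with tensor views. The sketch below is a library-free illustration only: `average_over_crops` is a hypothetical helper and the score values are made up for the example.

```python
from typing import List, Sequence

Scores = List[float]


def average_over_crops(per_crop_scores: Sequence[Scores], ncrops: int) -> List[Scores]:
    """Fold a flat list of bs * ncrops score vectors into bs averaged vectors."""
    if len(per_crop_scores) % ncrops != 0:
        raise ValueError("number of score vectors must be a multiple of ncrops")
    averaged: List[Scores] = []
    for start in range(0, len(per_crop_scores), ncrops):
        group = per_crop_scores[start:start + ncrops]
        # Average each class score over the crops that belong to one image.
        averaged.append([sum(col) / ncrops for col in zip(*group)])
    return averaged


# Two images, five crops each, two classes (illustrative numbers).
scores = [[1.0, 0.0]] * 5 + [
    [0.0, 2.0], [0.0, 4.0], [0.0, 2.0], [0.0, 4.0], [0.0, 3.0],
]
labels = [0, 1]  # still one label per original image

avg = average_over_crops(scores, ncrops=5)
predictions = [max(range(len(s)), key=s.__getitem__) for s in avg]
print(len(avg) == len(labels), predictions)  # prints "True [0, 1]"
```

With this approach the label vector never needs to change, which sidesteps the question of how to replicate meta-data or detection targets.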

Versions

Latest main branch

cc @vfdev-5 @datumbox @bjuncek @pmeier

Issue Analytics

State:
Created a year ago
Comments:8 (4 by maintainers)

Top GitHub Comments

datumbox commented, Aug 22, 2022

Looks OK to me.

Another thing I would avoid, if you agree, is the use of transforms.MultiCropResult. Why can’t we put everything in a simple List, so that the signature clearly shows the expected input (List[features.Image])? I understand that in the future, if we want to provide such a transform, we would need to be able to identify that this is a special type of list. But if this list is not used at the moment, it only gets in the way for the user. Such a future change can happen in a BC manner, I think.

Let me know if I’m missing something important here.

pmeier commented, Aug 22, 2022

That would look like

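from typing import Tuple
import torch
# Prototype namespaces as of this discussion (Aug 2022); they may have moved since.
from torchvision.prototype import features, transforms

class BatchMultiCrop(transforms.Transform):
    def forward(self, sample: Tuple[transforms.MultiCropResult, features.Label]):
        images, labels = sample
        batch_size = len(images)  # number of crops produced for this sample
        # Stack the crops into one batched image.
        images = features.Image.new_like(images[0], torch.stack(images))
        # Repeat the label once per crop so both outputs have the same length.
        labels = features.Label.new_like(labels, labels.repeat(batch_size, *[1] * labels.ndim))
        return images, labels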

Note that we cannot have a forward(self, images: transforms.MultiCropResult, labels: features.Label) signature, because even if you use transform = transforms.Compose([transforms.FiveCrop(), BatchMultiCrop()]) with transform(image, label), the output of FiveCrop will always be a tuple. The only way to avoid this would be to remove the FiveCrop from the example, but I’m not sure whether we would then be wading into “useless example” territory, since we would showcase something that will not happen in practice and move the burden entirely onto the user.
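The signature constraint can be illustrated with a toy pipeline. This is only a sketch under assumptions: `ToyCompose`, `toy_five_crop` and the two batching functions are hypothetical stand-ins written for this example, not torchvision classes, and the packing of positional inputs into one sample is my reading of the behaviour described above.

```python
from typing import Any, Callable, List, Sequence, Tuple


class ToyCompose:
    """Chains callables, handing each stage's single return value to the next."""

    def __init__(self, stages: Sequence[Callable[[Any], Any]]) -> None:
        self.stages = list(stages)

    def __call__(self, *inputs: Any) -> Any:
        # Several positional inputs are packed into one sample tuple.
        sample = inputs if len(inputs) > 1 else inputs[0]
        for stage in self.stages:
            sample = stage(sample)
        return sample


def toy_five_crop(sample: Tuple[str, int]) -> Tuple[List[str], int]:
    """Stand-in for FiveCrop: returns a (crops, label) tuple."""
    image, label = sample
    return [f"{image}-crop{i}" for i in range(5)], label


def batch_from_sample(sample: Tuple[List[str], int]) -> Tuple[List[str], List[int]]:
    """Accepts the whole sample tuple, like the forward(self, sample) variant."""
    crops, label = sample
    return crops, [label] * len(crops)


def batch_from_two_args(crops: List[str], label: int) -> Tuple[List[str], List[int]]:
    """Two-argument variant; a composed pipeline never calls it this way."""
    return crops, [label] * len(crops)


images, labels = ToyCompose([toy_five_crop, batch_from_sample])("img", 1)
print(len(images), len(labels))  # prints "5 5"

try:
    ToyCompose([toy_five_crop, batch_from_two_args])("img", 1)
    two_arg_failed = False
except TypeError:
    # The previous stage returned one tuple, so only one argument was passed.
    two_arg_failed = True
print(two_arg_failed)  # prints "True"
```

Because each stage receives exactly what the previous one returned, a stage that follows a tuple-returning transform has to unpack the tuple itself.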