Unet: Handle arbitrarily sized input images.


Currently the Unet model doesn’t handle arbitrary input image sizes.

For example, here the spatial dimensions of the output don’t match those of the input:

>>> import torch
>>> import segmentation_models_pytorch as smp
>>> model = smp.Unet()
>>> img = torch.rand(1, 3, 127, 127)
>>> model(img).shape
torch.Size([1, 1, 128, 128])    # Spatial dims don't match input

And this input raises an error:

>>> img = torch.rand(1, 3, 129, 129)
>>> model(img).shape
RuntimeError                              Traceback (most recent call last)
...
...
.../decoder.py in forward(self, x, skip)
     36         x = F.interpolate(x, scale_factor=2, mode="nearest")
     37         if skip is not None:
---> 38             x = torch.cat([x, skip], dim=1)
     39             x = self.attention1(x)
     40         x = self.conv1(x)

RuntimeError: torch.cat(): Sizes of tensors must match except in dimension 1. Got 10 and 9 in dimension 2 (The offending index is 1)

The Problem: When the spatial size of the input to an encoder downsampling layer isn’t divisible by 2, the downsampled size is rounded, so doubling it no longer recovers the original size (it is off by 1 pixel). When this feature map is upsampled in the decoder using F.interpolate with scale_factor=2, its spatial dims won’t match those of the skip connection.
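The size arithmetic behind both examples can be traced without running the model. The following is a minimal sketch, assuming five downsampling stages that each map a size n to ceil(n / 2) (typical for stride-2 convolutions with kernel 3 and padding 1) and a decoder that doubles the size at every step; the function names are hypothetical and not part of segmentation_models_pytorch.

```python
import math


def encoder_sizes(n, depth=5):
    """Spatial size after each of `depth` stride-2 stages (assumed: n -> ceil(n / 2))."""
    sizes = []
    for _ in range(depth):
        n = math.ceil(n / 2)
        sizes.append(n)
    return sizes


def first_mismatch(n, depth=5):
    """Return (upsampled, skip) at the first decoder step where sizes differ, else None."""
    sizes = encoder_sizes(n, depth)
    x = sizes[-1]                       # deepest feature map
    for skip in reversed(sizes[:-1]):   # skip connections, deepest first
        x *= 2                          # F.interpolate(..., scale_factor=2)
        if x != skip:
            return (x, skip)
    return None


print(encoder_sizes(127))   # [64, 32, 16, 8, 4]
print(first_mismatch(127))  # None
print(encoder_sizes(129))   # [65, 33, 17, 9, 5]
print(first_mismatch(129))  # (10, 9)
```

Under these assumptions a 127-pixel input never produces a mismatch inside the decoder, but the last upsampling step turns 64 into 128 rather than 127. A 129-pixel input fails at the first decoder step with sizes 10 and 9, which agrees with the numbers in the error message.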

The Fix: Replace the bare F.interpolate call with an upsampling layer that takes an additional (optional) argument, output_size, and adjusts its result to that size. It could look as follows:


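import torch.nn as nn
import torch.nn.functional as F

class Upsample(nn.Module):
    # The constructor arguments are an assumption; the proposal leaves them open.
    def __init__(self, scale_factor=2, mode="nearest"):
        super().__init__()
        self.scale_factor = scale_factor
        self.mode = mode

    def forward(self, input, output_size=None):
        upsampled = F.interpolate(input, scale_factor=self.scale_factor, mode=self.mode)
        if output_size is not None and upsampled.shape[-2:] != tuple(output_size):
            # Pad the bottom/right edges to reach (height, width); negative padding crops.
            pad_h = output_size[0] - upsampled.shape[-2]
            pad_w = output_size[1] - upsampled.shape[-1]
            upsampled = F.pad(upsampled, (0, pad_w, 0, pad_h))
        return upsampled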

This is, however, a breaking change for any users (probably near zero) who rely on the current behavior (i.e. when the output size doesn’t match the input size).

I’m happy to create a pull request for this if it is agreed that this fix should be added.
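A workaround that leaves the current behavior untouched is to pad the input up to the next multiple of the network's total downsampling factor and crop the prediction back afterwards. Below is a minimal sketch of the padding arithmetic only, assuming five stride-2 stages (a total factor of 32); the helper name pad_amounts is hypothetical and not part of segmentation_models_pytorch.

```python
def pad_amounts(height, width, multiple=32):
    """Return (pad_bottom, pad_right) that round each dimension up to `multiple`.

    `multiple=32` is an assumption: 2 ** 5 for five stride-2 stages.
    """
    pad_h = (-height) % multiple
    pad_w = (-width) % multiple
    return pad_h, pad_w


print(pad_amounts(129, 129))  # (31, 31)
print(pad_amounts(127, 127))  # (1, 1)
print(pad_amounts(128, 256))  # (0, 0)
```

With PyTorch, the padding could then be applied as F.pad(img, (0, pad_w, 0, pad_h)) before the forward pass, and the prediction cropped back with out[..., :height, :width]. Which padding mode suits the data (zeros, reflection, and so on) is left open here.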

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (1 by maintainers)

Top GitHub Comments

1 reaction
qubvel commented, Jan 30, 2022

Hi @jenkspt, thanks a lot for the suggestion, it may be a great feature! However, as far as I know, interpolation with scale_factor is friendlier to exporting to other formats. It has been a long time since I last heard about these issues; maybe ONNX and jit trace now support dynamic shapes? It would be good to investigate this question before changing the current behavior.

0 reactions
github-actions[bot] commented, Apr 10, 2022

This issue was closed because it has been stalled for 7 days with no activity.


