Using custom input image shape
Can we use a custom input image shape while training? I would like to set an input shape of (512, 512, 3), but anything other than (32, 32, 3) throws a mismatch error.
Can you explain how to determine the encoder and decoder network parameters?
Thanks!
Issue Analytics
- Created: a year ago
- Comments: 7 (5 by maintainers)
@pranjal-joshi-cc has @mauicv answered your question above? If so we shall close this issue 🙂
Hey @pranjal-joshi-cc,
I'm not completely sure what you mean here? The choice of autoencoder architecture is largely arbitrary; the one constraint imposed by the `VAEOutlier` really only applies to the decoder. It needs to map from the latent space of size `latent_dim` to the same shape as the original input image, so in your case `(512, 512, 3)`.

In terms of the output shape of the encoder, it doesn't really matter as long as the capacity is sufficient, basically that you don't reduce the dimensionality too much. For instance, with the architecture I provided above:
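The original architecture block didn't survive here, so purely as an illustrative sketch (assuming `tf.keras`; the filter counts and kernel sizes are my guesses, not the original values), an `encoder_net` that takes `(512, 512, 3)` inputs down to an `(8, 8, 1024)` output could look like:

```python
import tensorflow as tf
from tensorflow.keras.layers import Conv2D

# Illustrative only: six stride-2 convolutions halve the spatial size
# 512 -> 256 -> 128 -> 64 -> 32 -> 16 -> 8. Filter counts are guesses.
encoder_net = tf.keras.Sequential([
    tf.keras.Input(shape=(512, 512, 3)),
    Conv2D(32, 4, strides=2, padding='same', activation='relu'),
    Conv2D(64, 4, strides=2, padding='same', activation='relu'),
    Conv2D(128, 4, strides=2, padding='same', activation='relu'),
    Conv2D(256, 4, strides=2, padding='same', activation='relu'),
    Conv2D(512, 4, strides=2, padding='same', activation='relu'),
    Conv2D(1024, 4, strides=2, padding='same', activation='relu'),
])
print(encoder_net.output_shape)  # (None, 8, 8, 1024)
```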
So the output shape of the `encoder_net` is `(8, 8, 1024)`. Note that the `VAEOutlier` adds some Dense layers to the `encoder_net` to transform the `(8, 8, 1024)` output into the latent space of dimension 1024, since you've chosen `latent_dim=1024`.

The `decoder_net` maps from the latent space of dimension 1024 (in our case) to the output shape `(512, 512, 3)`. So it is going to take a vector of length `latent_dim`. We want to transform this into a shape that can then easily be scaled up to `(512, 512, 3)`. You can do this in a number of ways, but it's easiest if we set up the `Conv2DTranspose` operations to double the height and width at each layer of the network. The reason we choose `8*8*1024` is just that this can then be reshaped into `(8, 8, 1024)`
. We can then upscale this to obtain the output image by applying each of the transpose layers. For instance, given the architecture I suggested above, the latent vector of shape `(1, 1024)` is mapped to a vector of shape `(1, 8*8*1024)`, which is then reshaped to `(1, 8, 8, 1024)` and then upscaled by each of the transpose layers:

```
(1, 8, 8, 1024) -> (1, 16, 16, 1024) -> (1, 32, 32, 512) -> (1, 64, 64, 256)
               -> (1, 128, 128, 128) -> (1, 256, 256, 64) -> (1, 512, 512, 32)
               -> (1, 512, 512, 3)
```
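This chain can be put into code roughly as follows (a sketch assuming `tf.keras`; the kernel sizes, activations, and final layer are my assumptions, not the original code):

```python
import tensorflow as tf
from tensorflow.keras.layers import Conv2DTranspose, Dense, Reshape

latent_dim = 1024

# Illustrative decoder: Dense + Reshape give (8, 8, 1024), then six
# stride-2 transpose convolutions double the spatial size 8 -> 512,
# and a final stride-1 layer maps down to 3 channels.
decoder_net = tf.keras.Sequential([
    tf.keras.Input(shape=(latent_dim,)),
    Dense(8 * 8 * 1024, activation='relu'),
    Reshape((8, 8, 1024)),
    Conv2DTranspose(1024, 4, strides=2, padding='same', activation='relu'),
    Conv2DTranspose(512, 4, strides=2, padding='same', activation='relu'),
    Conv2DTranspose(256, 4, strides=2, padding='same', activation='relu'),
    Conv2DTranspose(128, 4, strides=2, padding='same', activation='relu'),
    Conv2DTranspose(64, 4, strides=2, padding='same', activation='relu'),
    Conv2DTranspose(32, 4, strides=2, padding='same', activation='relu'),
    Conv2DTranspose(3, 4, strides=1, padding='same', activation='sigmoid'),
])
print(decoder_net.output_shape)  # (None, 512, 512, 3)
```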
. So `8*8*1024` is really chosen as a convenience in order to reshape the tensor. Typically we choose image heights and widths to be powers of 2 just because it makes this operation of scaling up and down simpler, but in general this doesn't have to be the case. The formula for the output size of a transpose convolution is documented here.
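For reference, here is a small sketch of that bookkeeping in plain Python. It follows the `tf.keras` convention for `Conv2DTranspose` with no `output_padding`; treat it as my own summary rather than the linked documentation:

```python
def conv_transpose_length(input_len: int, kernel: int, stride: int,
                          padding: str = "same") -> int:
    """Output length along one spatial dimension of a transpose
    convolution (tf.keras convention, output_padding=None)."""
    if padding == "same":
        return input_len * stride
    if padding == "valid":
        return input_len * stride + max(kernel - stride, 0)
    raise ValueError(f"unknown padding: {padding!r}")

# With padding='same' and stride 2, each layer doubles the size,
# so six layers take 8 up to 512:
size = 8
for _ in range(6):
    size = conv_transpose_length(size, kernel=4, stride=2)
print(size)  # 512
```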