Training on Dataset with different image sizes
Hello.
I’m trying to use your implementation to train on my own dataset. I have multiple images with multiple sizes, annotated with the Pascal VOC format.
The thing is I keep getting the following error:
…/ssd_batch_generator.py", line 1182, in generate
    batch_X = np.array(batch_X)
ValueError: could not broadcast input array from shape (224,224,4) into shape (224,224)
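For context, this error comes from NumPy's batch stacking rather than from the model itself: np.array() can only stack a list of images into one batch array if every image has the same shape. A minimal sketch (not from the repository) that triggers the same kind of failure by mixing a single-channel and a 4-channel image:

import numpy as np

# Minimal sketch: mixing channel counts in one batch makes np.array() fail,
# because it cannot find one common shape for all elements.
gray_image = np.zeros((224, 224), dtype=np.uint8)     # single-channel image
rgba_image = np.zeros((224, 224, 4), dtype=np.uint8)  # image with an alpha channel
batch_X = [gray_image, rgba_image]
batch_X = np.array(batch_X)  # older NumPy raises the broadcast error quoted above;
                             # NumPy >= 1.24 reports an "inhomogeneous shape" instead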
I’m using the example of SSD 7 to get started, with this generator configuration:
train_generator = train_dataset.generate(batch_size=batch_size,
                                         shuffle=True,
                                         train=True,
                                         ssd_box_encoder=ssd_box_encoder,
                                         convert_to_3_channels=True,
                                         equalize=False,
                                         brightness=(0.5, 2, 0.5), # Randomly change brightness between 0.5 and 2 with probability 0.5
                                         flip=0.5, # Randomly flip horizontally with probability 0.5
                                         translate=((5, 70), (3, 50), 0.5), # Randomly translate by 5-70 pixels horizontally and 3-50 pixels vertically with probability 0.5
                                         scale=(0.7, 1.4, 0.5), # Randomly scale between 0.7 and 1.4 with probability 0.5
                                         max_crop_and_resize=False,
                                         random_pad_and_resize=False,
                                         random_crop=False,
                                         crop=False,
                                         resize=(224, 224),
                                         gray=False,
                                         limit_boxes=True,
                                         include_thresh=0.4)
val_generator = val_dataset.generate(batch_size=batch_size,
                                     shuffle=True,
                                     train=True,
                                     ssd_box_encoder=ssd_box_encoder,
                                     convert_to_3_channels=True,
                                     equalize=False,
                                     brightness=False,
                                     flip=False,
                                     translate=False,
                                     scale=False,
                                     max_crop_and_resize=False,
                                     random_pad_and_resize=False,
                                     random_crop=False,
                                     crop=False,
                                     resize=(224, 224),
                                     gray=False,
                                     limit_boxes=True,
                                     include_thresh=0.4)
Is there any configuration I should set to avoid this error? My guess is that it happens because my dataset contains images of multiple sizes.
Any help is appreciated, thanks! And also thanks for this great repository.
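As a first check, a short script along these lines can show whether the dataset mixes channel counts and not just image sizes; the images_dir path is a placeholder and the use of PIL is an assumption, not part of the original setup:

import os
from PIL import Image

# Diagnostic sketch with a hypothetical dataset path: list every image whose
# mode is not plain RGB (e.g. RGBA with an alpha channel, or grayscale).
images_dir = 'path/to/JPEGImages'  # placeholder, adjust to your dataset
for filename in sorted(os.listdir(images_dir)):
    with Image.open(os.path.join(images_dir, filename)) as img:
        if img.mode != 'RGB':
            print(filename, img.mode, img.size)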
Top GitHub Comments
Glad to hear it. Yeah, the generator cannot yet deal with 4-channel images unless all images have 4 channels; I guess I'll fix that soon. What data format were your images in? What's the fourth channel? Alpha? Depth?
Yeah, if an image has a fourth channel, that channel will be a special channel (alpha, depth, etc.), so it isn’t possible to artificially create that fourth channel for a 3-channel image in a meaningful way. The best solution for a dataset in which only some images have four channels is therefore to just throw away the fourth channel, which is a relatively inexpensive operation in Numpy compared to what your function above does.
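In NumPy, throwing away the fourth channel amounts to slicing the last axis; here is a minimal sketch of that preprocessing step (the drop_alpha helper and the file path are illustrative, not from the repository):

import numpy as np
from PIL import Image

def drop_alpha(image_array):
    # Grayscale (H, W): replicate to 3 channels; RGBA (H, W, 4): keep RGB only.
    if image_array.ndim == 2:
        return np.stack([image_array] * 3, axis=-1)
    return image_array[..., :3]

img = np.array(Image.open('example.png'))  # may load as RGB, RGBA, or grayscale
img = drop_alpha(img)
assert img.shape[-1] == 3

Alternatively, Image.open(path).convert('RGB') performs the same normalization at load time.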
Thanks for the feedback; I'll close this issue.