Using grayscale input images instead of RGB.
Hello @ajbrock! Thank you so much for making your model available for others to use. I'm trying to re-purpose it at the moment for a research project.
I have a two-fold issue: one piece is data-related, the other architecture-related.
- I am trying to use a dataset of .png grayscale images produced by an analogue-to-digital converter. The images are 512x512 and there is only 1 class. I have made the following modifications to get the dataset loaded (`larcv` is the dataset name):
In `utils.py`:

```python
# Convenience dicts
dset_dict = {'larcv_png': dset.ImageFolder, 'larcv_hdf5': dset.ILSVRC_HDF5,
             'I32': dset.ImageFolder, 'I64': dset.ImageFolder,
             'I128': dset.ImageFolder, 'I256': dset.ImageFolder,
             'I32_hdf5': dset.ILSVRC_HDF5, 'I64_hdf5': dset.ILSVRC_HDF5,
             'I128_hdf5': dset.ILSVRC_HDF5, 'I256_hdf5': dset.ILSVRC_HDF5,
             'C10': dset.CIFAR10, 'C100': dset.CIFAR100}
imsize_dict = {'larcv_png': 512, 'larcv_hdf5': 512,
               'I32': 32, 'I32_hdf5': 32,
               'I64': 64, 'I64_hdf5': 64,
               'I128': 128, 'I128_hdf5': 128,
               'I256': 256, 'I256_hdf5': 256,
               'C10': 32, 'C100': 32}
root_dict = {'larcv_png': 'larcv_png', 'larcv_hdf5': 'ILSVRC512.hdf5',
             'I32': 'ImageNet', 'I32_hdf5': 'ILSVRC32.hdf5',
             'I64': 'ImageNet', 'I64_hdf5': 'ILSVRC64.hdf5',
             'I128': 'ImageNet', 'I128_hdf5': 'ILSVRC128.hdf5',
             'I256': 'ImageNet', 'I256_hdf5': 'ILSVRC256.hdf5',
             'C10': 'cifar', 'C100': 'cifar'}
nclass_dict = {'larcv_png': 1, 'larcv_hdf5': 1,
               'I32': 1000, 'I32_hdf5': 1000,
               'I64': 1000, 'I64_hdf5': 1000,
               'I128': 1000, 'I128_hdf5': 1000,
               'I256': 1000, 'I256_hdf5': 1000,
               'C10': 10, 'C100': 100}
# Number of classes to put per sample sheet
classes_per_sheet_dict = {'larcv_png': 1, 'larcv_hdf5': 1,
                          'I32': 50, 'I32_hdf5': 50,
                          'I64': 50, 'I64_hdf5': 50,
                          'I128': 20, 'I128_hdf5': 20,
                          'I256': 20, 'I256_hdf5': 20,
                          'C10': 10, 'C100': 100}
```
The dataset does serialize and load successfully, but when I check the dimensions of the images inside the `ILSVRC_HDF5` class in `datasets.py` using `img.shape`, they come out as `[3, 512, 512]`. This leads to a size mismatch in the `forward` function of `G_D` at the line

```python
D_input = torch.cat([G_z, x], 0) if x is not None else G_z
```

where `G_z.shape == [4, 1, 512, 512]` and `x.shape == [4, 3, 512, 512]` (a possible fix is sketched below).
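The 3-channel tensors most likely come from the image loader rather than the files themselves: torchvision's default `pil_loader` (and the repo's loader in `datasets.py`, if it follows the same pattern) calls `img.convert('RGB')` on every image. A minimal sketch of two ways to keep a single channel, assuming a standard torchvision transform pipeline:

```python
from PIL import Image
import torchvision.transforms as transforms

# Option 1: a custom loader that keeps a single channel. torchvision's
# default pil_loader calls img.convert('RGB'), which is the likely source
# of the 3-channel tensors.
def grayscale_loader(path):
    with open(path, 'rb') as f:
        img = Image.open(f)
        return img.convert('L')  # 'L' = single-channel grayscale

# Option 2: force one channel in the transform pipeline instead.
transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),
    transforms.ToTensor(),
    transforms.Normalize([0.5], [0.5]),  # single-channel mean/std
])
```

Since `img.shape` already reads `[3, 512, 512]` inside `ILSVRC_HDF5`, the RGB conversion has evidently been baked in at serialization time, so the fix would need to be applied in whatever loader feeds the HDF5 packaging step, not just at training time.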
- I've made the following changes to the `D_arch` dictionary in order to accommodate the 512x512 images (a quick shape check covering both modifications follows the generator snippet below):
```python
arch[512] = {'in_channels' : [1] + [ch*item for item in [1, 2, 4, 8, 8, 16, 16]],
             'out_channels' : [item * ch for item in [1, 2, 4, 4, 8, 8, 16, 16]],
             'downsample' : [True] * 7 + [False],
             'resolution' : [512, 256, 128, 64, 32, 16, 8, 4],
             'attention' : {2**i: 2**i in [int(item) for item in attention.split('_')]
                            for i in range(2, 10)}}
```
I have also modified the last layer of the Generator to output 1-channel images:

```python
# output layer: batchnorm-relu-conv.
# Consider using a non-spectral conv here
self.output_layer = nn.Sequential(layers.bn(self.arch['out_channels'][-1],
                                            cross_replica=self.cross_replica,
                                            mybn=self.mybn),
                                  self.activation,
                                  self.which_conv(self.arch['out_channels'][-1], 1))
```
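As a quick way to sanity-check both modifications before training, a shape test along these lines can confirm the pieces fit together; the constructor arguments and the `G(z, G.shared(y))` call pattern are assumptions based on how the repo's sample helpers are typically invoked, so verify them against your checkout:

```python
import torch
from BigGAN import Generator, Discriminator  # assumes the repo's BigGAN.py

# Hypothetical smoke test: constructor defaults and call patterns are
# assumptions, not taken verbatim from the repo.
G = Generator(resolution=512, n_classes=1)
D = Discriminator(resolution=512, n_classes=1)

z = torch.randn(2, G.dim_z)           # latent batch of 2
y = torch.zeros(2, dtype=torch.long)  # labels for the single class

with torch.no_grad():
    fake = G(z, G.shared(y))   # expect shape [2, 1, 512, 512]
    score = D(fake, y)         # expect shape [2, 1]
print(fake.shape, score.shape)
```

If `G_arch` in `BigGAN.py` lacks a 512 entry, it would need one analogous to the `D_arch` addition above.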
My questions are:
- How can I get the images to load with only 1 channel?
- Are the architecture modifications I’ve made appropriate?
Thank you so much.
Top GitHub Comments
@nassarofficial From the README, do the following:
“To prep your own dataset, you will need to add it to datasets.py and modify the convenience dicts in utils.py (dset_dict, imsize_dict, root_dict, nclass_dict, classes_per_sheet_dict) to have the appropriate metadata for your dataset. Repeat the process in prepare_data.sh (optionally produce an HDF5 preprocessed copy, and calculate the Inception Moments for FID).”
You can see the modifications I made for my dataset in my original post. If your images are larger than 256x256, you’ll have to modify the discriminator architecture accordingly.
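For the prepare_data.sh step mentioned in the quote, the adaptation for a custom dataset might look like the following; the script names mirror the repo's ImageNet examples, and the exact flags are assumptions to verify against your checkout:

```bash
# Hypothetical adaptation of prepare_data.sh for the larcv dataset
# (flags assumed from the repo's ImageNet examples)
python make_hdf5.py --dataset larcv_png --batch_size 256 --data_root data
python calculate_inception_moments.py --dataset larcv_hdf5 --data_root data
```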
@kseuro how did you load your own dataset?