Loading frames to 3D CNN
Hi, everyone,
I’m trying to load frames from a dataset into a 3D Convolutional Neural Network. I wrote an algorithm to extract frames from the videos of the UCF101 Action Recognition dataset, 40 frames per video, so I now have a new dataset with subfolders representing classes, and each class folder holds 40 frames per video. For example, if a class folder has 50 videos, it holds 50*40 = 2000 frames.
The model that I’m using is coded below:
`def cnn_3d(self):
    """
    The 3D CNN method
    """
    # Layers
    model = Sequential()
    model.add(Conv3D(32, (3,3,3), activation='relu', input_shape=(40, 80, 80, 3)))
    model.add(MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)))
    model.add(Conv3D(64, (3,3,3), activation='relu'))
    model.add(MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)))
    model.add(Conv3D(128, (3,3,3), activation='relu'))
    model.add(Conv3D(128, (3,3,3), activation='relu'))
    model.add(MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)))
    model.add(Conv3D(256, (2,2,2), activation='relu'))
    model.add(Conv3D(256, (2,2,2), activation='relu'))
    model.add(MaxPooling3D(pool_size=(1, 2, 2), strides=(1, 2, 2)))
    # FC Layers
    model.add(Flatten())
    model.add(Dense(1024))
    model.add(Dropout(0.5))
    model.add(Dense(1024))
    model.add(Dropout(0.5))
    model.add(Dense(self.n_classes, activation='softmax'))
    return model`
And I’m trying to load 40 frames at a time to train the network. What is the best way to do this in Keras? Is it enough to set input_shape to a tuple (frames, w, h, color)? Does the 3D input shape of a 3D CNN take 40 frames at a time?
I’m trying to use ImageDataGenerator to fit the model:
`train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.0,
    zoom_range=0.0,
    horizontal_flip=False,
    featurewise_center=False,
    featurewise_std_normalization=False,
    rotation_range=0.0,
    width_shift_range=0.0,
    height_shift_range=0.0)

test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    '/train/',
    target_size=(80, 80),
    batch_size=32,
    class_mode='categorical')

validation_generator = test_datagen.flow_from_directory(
    '/test/',
    target_size=(80, 80),
    batch_size=32,
    class_mode='categorical')

model.fit_generator(
    train_generator,
    steps_per_epoch=1000,
    epochs=30,
    validation_data=validation_generator,
    validation_steps=1000)`
And I’m getting this error:
`ValueError: Error when checking input: expected conv3d_62_input to have 5 dimensions, but got array with shape (32, 80, 80, 3)`
Can someone help me? Thanks for the support and attention!
I used OpenCV to preprocess the frames and build one big tensor of images (or batches of images, such as sets of 15 frames).
Use this snippet:
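A minimal sketch of this kind of preprocessing, not the commenter's original code: it assumes the frames of one video are already extracted as image files and passed in playback order. The names `load_frame`, `build_clips`, and the `loader` parameter are hypothetical. Each frame is read with OpenCV, consecutive frames are grouped into fixed-length sets, and one target per set is appended to `y_data`.

```python
import numpy as np


def load_frame(path, size=(30, 30)):
    """Read one frame as grayscale with OpenCV, resize it, scale to [0, 1]."""
    import cv2  # imported here so build_clips can also run with another loader

    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if img is None:  # cv2.imread returns None when the file cannot be read
        raise FileNotFoundError(f"could not read frame: {path}")
    img = cv2.resize(img, size)  # size is (width, height)
    return img.astype("float32") / 255.0


def build_clips(frame_paths, label, clip_len=15, loader=load_frame):
    """Group consecutive frames into clips of clip_len frames each.

    Returns X_data with shape (clips, clip_len, height, width, 1) and
    y_data with one label per clip. Leftover frames that do not fill a
    whole clip are dropped.
    """
    X_data, y_data = [], []
    for start in range(0, len(frame_paths) - clip_len + 1, clip_len):
        clip = [loader(p) for p in frame_paths[start:start + clip_len]]
        X_data.append(np.stack(clip)[..., np.newaxis])  # add the channel axis
        y_data.append(label)
    return np.array(X_data), np.array(y_data)
```

Calling `build_clips` once per video with that video's class number as `label` gives per-video arrays that can be combined afterwards.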
You can use NumPy to reshape your final dataset to (batch, frames, height, width, channels), as nicholasding said.
So, if you have 10000 samples in total, using sets of 10 frames per input, with 30 x 30 dimensions and 1 color channel, you may reshape your X_data like:
X_data = X_data.reshape(10000, 10, 30, 30, 1) #this way you get 5 dimensions. :)
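To see what that reshape does, here is a small sketch with made-up sizes (6 samples of 10 frames instead of 10000). It assumes the frames are stored in order, so that each run of 10 consecutive frames belongs to the same sample; otherwise the reshape silently mixes frames from different clips.

```python
import numpy as np

n_samples, n_frames, height, width = 6, 10, 30, 30

# Flat stack of grayscale frames, one after another: (60, 30, 30).
# Each frame is filled with its own index so the grouping is easy to check.
frame_ids = np.arange(n_samples * n_frames, dtype="float32")
flat = np.tile(frame_ids[:, None, None], (1, height, width))

X_data = flat.reshape(n_samples, n_frames, height, width, 1)
print(X_data.shape)  # (6, 10, 30, 30, 1)

# Sample 2 holds frames 20..29 of the flat stack.
print(X_data[2, :, 0, 0, 0])
```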
Note that in the snippet shown above (using OpenCV), a target of 1 is appended to a separate array (y_data) for each set of frames in X_data. If you are working with more than one class, repeat this code for each class, appending a different number to y_data. At the end of the preprocessing, stack the per-class X_data arrays to get your complete X. You can do this with NumPy’s vstack() function, like this:
This way you’ll get your final X_data with 10000 samples. Repeat this process for y_data and train your model! 😃
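A minimal sketch of that stacking step, assuming two hypothetical per-class arrays (`X_class0`, `X_class1`) that already have the 5D shape. `np.vstack` joins them along the first (sample) axis. The label arrays here are 1D, so `np.concatenate` is used for them, since `vstack` would turn them into rows of a 2D array.

```python
import numpy as np

# Hypothetical per-class data: 4 clips of class 0 and 6 clips of class 1.
X_class0 = np.zeros((4, 10, 30, 30, 1), dtype="float32")
X_class1 = np.ones((6, 10, 30, 30, 1), dtype="float32")
y_class0 = np.full(4, 0)
y_class1 = np.full(6, 1)

X_data = np.vstack([X_class0, X_class1])       # (10, 10, 30, 30, 1)
y_data = np.concatenate([y_class0, y_class1])  # (10,)

# Shuffle clips and labels with the same permutation so they stay aligned.
rng = np.random.default_rng(0)
order = rng.permutation(len(X_data))
X_data, y_data = X_data[order], y_data[order]
```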
The 5 dimensions are (batch, frames, height, width, channels). I wrote a generator to produce such tensors for 3D CNN training.
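A sketch of what such a generator might look like, assuming the clips and one-hot labels already fit in memory as NumPy arrays. `clip_batch_generator` is a hypothetical name, not the commenter's actual code. It loops forever, as Keras generator-based training expects, and yields `(batch, frames, height, width, channels)` arrays together with their labels.

```python
import numpy as np


def clip_batch_generator(X, y, batch_size=32, shuffle=True, seed=None):
    """Yield (clips, labels) batches forever.

    X: array of shape (samples, frames, height, width, channels)
    y: array of shape (samples, n_classes), one-hot labels
    """
    rng = np.random.default_rng(seed)
    n = len(X)
    while True:
        # Reshuffle at the start of every pass over the data.
        order = rng.permutation(n) if shuffle else np.arange(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            yield X[idx], y[idx]


# Tiny stand-in data: 5 clips of 40 frames, 80x80 RGB, 3 classes.
X = np.zeros((5, 40, 80, 80, 3), dtype="float32")
y = np.eye(3, dtype="float32")[[0, 1, 2, 0, 1]]

gen = clip_batch_generator(X, y, batch_size=2, shuffle=False)
xb, yb = next(gen)
print(xb.shape, yb.shape)  # (2, 40, 80, 80, 3) (2, 3)
```

Here `xb.shape[1:]` matches the `input_shape=(40, 80, 80, 3)` of the model in the question, so in the Keras version used there the generator could be passed to `model.fit_generator` with `steps_per_epoch` set to the number of batches per pass.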