Stacking Convolutions and LSTM
I would like to stack 2D convolutions and LSTM layers, which is exactly the problem described in #129.
The solution proposed in #129 is a custom reshape layer. By now, Keras ships a built-in Reshape layer. Searching for this problem on Stack Overflow brings up a similar question, whose accepted answer suggests using the built-in layer.
As a toy example, I would like to classify MNIST with a combination of Conv layers and an LSTM. I've sliced the images into four parts and arranged those parts into sequences, then stacked the sequences. My training data is a numpy array with shape [60000, 4, 1, 56, 14], where:
- 60000 is the number of samples
- 4 is the number of timesteps
- 1 is the number of color channels (I'm using the Theano image layout)
- 56 and 14 are width and height
Please note: each image slice is 14x14, since I've cut the 28x28 image into four parts. The 56 in the shape appears because I've created 4 different sequences and stacked them along this axis.
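The slicing described above can be sketched roughly like this (a minimal sketch using plain numpy; the quadrant ordering and the stacking along the width axis are assumptions, since the issue does not spell them out):

```python
import numpy as np

# toy batch of MNIST-like images: (num_samples, 28, 28)
images = np.random.rand(8, 28, 28).astype("float32")

def to_sequence(imgs):
    # cut each 28x28 image into four 14x14 quadrants and treat them
    # as a sequence of 4 timesteps
    tl = imgs[:, :14, :14]
    tr = imgs[:, :14, 14:]
    bl = imgs[:, 14:, :14]
    br = imgs[:, 14:, 14:]
    seq = np.stack([tl, tr, bl, br], axis=1)  # (N, 4, 14, 14)
    return seq[:, :, np.newaxis, :, :]        # add channel axis: (N, 4, 1, 14, 14)

seq = to_sequence(images)
print(seq.shape)  # (8, 4, 1, 14, 14)

# stacking four such sequences along the width axis yields the
# (N, 4, 1, 56, 14) layout described above
stacked = np.concatenate([seq] * 4, axis=3)
print(stacked.shape)  # (8, 4, 1, 56, 14)
```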
Here is my code so far:

```python
from keras.models import Sequential
from keras.layers import Convolution2D, Activation, MaxPooling2D, Reshape, Dropout, LSTM, Dense

nb_filters = 32
kernel_size = (3, 3)
pool_size = (2, 2)
nb_classes = 10
batch_size = 64

model = Sequential()
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode="valid", input_shape=[1, 56, 14]))
model.add(Activation("relu"))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=pool_size))
model.add(Reshape((56*14,)))
model.add(Dropout(0.25))
model.add(LSTM(5))
model.add(Dense(50))
model.add(Dense(nb_classes))
model.add(Activation("softmax"))
```
When run, the Reshape layer raises a ValueError:

```
ValueError: total size of new array must be unchanged
```
I also tried passing the number of timesteps to the Reshape layer: `model.add(Reshape((4, 56*14)))`
But that doesn’t solve the problem either.
What is the correct shape to give to the Reshape layer? Is a Reshape layer the correct solution at all?
I've posted the same question on Stack Overflow.
Issue Analytics
- Created: 7 years ago
- Comments: 18 (2 by maintainers)
If I understand you correctly, you want to feed the output of the convolutional layers, which carries no time-sequence information, into an LSTM layer. The way to do this is to divide it into timesteps, either one element at a time or in batches. Since your feature map has a total dimensionality of 32*26*5 = 4160, you could, for example, treat every element of it as a timestep in a sequence. To do this, reshape it with `Reshape((4160, 1))`. To sequence N elements at a time, use `Reshape((4160/N, N))`, where N is an integer divisor of 4160.

Amazing, this seems to do the trick.
I wrapped everything up to the LSTM in a TimeDistributed layer and provided the number of timesteps as an additional input dimension. This runs without problems.
Am I doing this the right way?
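The TimeDistributed approach described above could look roughly like this (a minimal sketch using the current Keras API; the layer sizes, such as `LSTM(5)`, follow the earlier code, and folding the ReLU into the `Conv2D` layers is a simplification of mine):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (TimeDistributed, Conv2D, MaxPooling2D,
                                     Flatten, LSTM, Dense)

model = Sequential()
# 4 timesteps, each a (1, 56, 14) channels-first image slice
model.add(TimeDistributed(
    Conv2D(32, (3, 3), activation="relu", data_format="channels_first"),
    input_shape=(4, 1, 56, 14)))
model.add(TimeDistributed(
    Conv2D(32, (3, 3), activation="relu", data_format="channels_first")))
model.add(TimeDistributed(
    MaxPooling2D(pool_size=(2, 2), data_format="channels_first")))
# flatten each timestep's (32, 26, 5) feature map into a 4160-vector
model.add(TimeDistributed(Flatten()))
model.add(LSTM(5))  # now sees 4 timesteps of 4160 features each
model.add(Dense(50))
model.add(Dense(10, activation="softmax"))

print(model.output_shape)  # (None, 10)
```

Here the convolutions run independently on every timestep, so the LSTM receives one feature vector per image slice rather than per pixel.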