Stacking multiple LSTM layers yields an error
See original GitHub issue

I tried to create a network with multiple LSTM layers. No matter what I try, this (or similar attempts) yields an error:
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM
nb_chars = 26 # a..z
nb_nodes = 50
model = Sequential()
model.add(Embedding(nb_chars, nb_chars))
model.add(LSTM(nb_chars, nb_nodes, activation='sigmoid', inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(LSTM(nb_nodes, nb_nodes, activation='sigmoid', inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(nb_nodes, nb_chars))
model.add(Activation('sigmoid'))
model.add(Dropout(0.5))
model.compile(loss='binary_crossentropy', optimizer='rmsprop')
Error:
Traceback (most recent call last):
File "test.py", line 21, in <module>
model.compile(loss='binary_crossentropy', optimizer='rmsprop')
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/models.py", line 71, in compile
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/models.py", line 155, in get_output
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 115, in get_output
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 140, in get_output
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 233, in get_output
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 115, in get_output
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/recurrent.py", line 338, in get_output
File "/home/tener/.local/lib/python3.4/site-packages/Theano-0.7.0-py3.4.egg/theano/tensor/var.py", line 341, in dimshuffle
pattern)
File "/home/tener/.local/lib/python3.4/site-packages/Theano-0.7.0-py3.4.egg/theano/tensor/elemwise.py", line 141, in __init__
(i, j, len(input_broadcastable)))
ValueError: new_order[2] is 2, but the input only has 2 axes.
If I replace one of the LSTM layers with, say, a Dense layer, it works. I cannot figure out why: according to the documentation, the output of the first LSTM and the input of the second should match.
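A minimal shape walk-through (plain NumPy, not Keras; the weight arrays are stand-ins) suggests what is going on: by default an LSTM layer returns only its last output vector, collapsing the time axis, so the second LSTM receives a tensor with 2 axes where it expects 3.

```python
import numpy as np

# Hedged sketch of the shape mismatch behind the ValueError.
batch, timesteps, nb_chars, nb_nodes = 4, 7, 26, 50

emb_out = np.zeros((batch, timesteps, nb_chars))  # Embedding output: 3 axes
w = np.zeros((nb_chars, nb_nodes))                # stand-in for LSTM weights

# By default the layer keeps only the last timestep's output:
lstm1_out = (emb_out @ w)[:, -1, :]
print(lstm1_out.ndim)   # 2 axes -- the next LSTM's dimshuffle then asks for
                        # axis 2, hence "new_order[2] is 2, but the input
                        # only has 2 axes"
```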
Issue Analytics
- State:
- Created 8 years ago
- Comments: 6 (2 by maintainers)
Top Results From Across the Web

Tensorflow Keras - Error while stacking LSTM layers
Adding additional LSTMs in the mix yields the following error which I cannot really understand. I'm using python 3.7.3 on Linux Ubuntu x64....

How to stack multiple LSTMs in keras?
The solution is to add return_sequences=True to all LSTM layers except the last one so that its output tensor has ndim=3 (i.e. batch...

Stacked Long Short-Term Memory Networks
The Stacked LSTM is an extension to this model that has multiple hidden LSTM layers where each layer contains multiple memory cells.

Predicting machine's performance record using the stacked ...
In this study, networks with two LSTM layers were investigated. The loss value was evaluated by the root-mean-square error (RMSE).

LSTM part 2 - Stateful and Stacking - YouTube
Here we discuss how to stack LSTMs and what Stateful LSTMs are. Again we use Keras Deep Learning Library. Note that I can...
Top GitHub Comments
A LSTM layer, as per the docs, will return the last vector by default rather than the entire sequence. In order to return the entire sequence (which is necessary to be able to stack LSTM), use the constructor argument
return_sequences=True
.LSTM layers have to have input in the size [samples, timesteps, features]
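The effect of return_sequences can be sketched without Keras at all. The toy function below (pure NumPy; toy_rnn and its weights are hypothetical stand-ins, not real LSTM math) only illustrates the output shapes: with return_sequences=True the time axis survives, so the result can feed another recurrent layer.

```python
import numpy as np

def toy_rnn(x, units, return_sequences=False):
    """Shape-only stand-in for a recurrent layer.

    x: (samples, timesteps, features)
    returns (samples, timesteps, units) if return_sequences else (samples, units)
    """
    samples, timesteps, features = x.shape
    w = np.ones((features, units))           # stand-in for the real weights
    outputs = x @ w                          # one output vector per timestep
    return outputs if return_sequences else outputs[:, -1, :]

x = np.zeros((4, 7, 26))                                  # [samples, timesteps, features]
print(toy_rnn(x, 50, return_sequences=True).shape)   # (4, 7, 50) -- 3 axes, stackable
print(toy_rnn(x, 50, return_sequences=False).shape)  # (4, 50)    -- last step only
```

This mirrors the fix quoted above: every LSTM except the last one in the stack should keep the time axis.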