Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging third-party libraries. It collects links to all the places you might look while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Stacking multiple LSTM layers yields an error

See original GitHub issue

I tried to create a network with multiple LSTM layers. No matter what I try, this attempt (and similar ones) yields an error:

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM

nb_chars = 26 # a..z
nb_nodes = 50

model = Sequential()
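# Note: Keras 0.x layers take (input_dim, output_dim) as positional arguments.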
model.add(Embedding(nb_chars, nb_chars))
model.add(LSTM(nb_chars, nb_nodes, activation='sigmoid', inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(LSTM(nb_nodes, nb_nodes, activation='sigmoid', inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(nb_nodes, nb_chars))
model.add(Activation('sigmoid'))
model.add(Dropout(0.5))

model.compile(loss='binary_crossentropy', optimizer='rmsprop')

Error:

Traceback (most recent call last):
  File "test.py", line 21, in <module>
    model.compile(loss='binary_crossentropy', optimizer='rmsprop')
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/models.py", line 71, in compile
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/models.py", line 155, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 115, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 140, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 233, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 115, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
  File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/recurrent.py", line 338, in get_output
  File "/home/tener/.local/lib/python3.4/site-packages/Theano-0.7.0-py3.4.egg/theano/tensor/var.py", line 341, in dimshuffle
    pattern)
  File "/home/tener/.local/lib/python3.4/site-packages/Theano-0.7.0-py3.4.egg/theano/tensor/elemwise.py", line 141, in __init__
    (i, j, len(input_broadcastable)))
ValueError: new_order[2] is 2, but the input only has 2 axes.

If I replace one of the LSTM layers with, say, Dense, it works. I cannot figure out why; according to the documentation, shouldn't the inputs and outputs of both match?

Issue Analytics

  • State: closed
  • Created 8 years ago
  • Comments:6 (2 by maintainers)

Top GitHub Comments

17 reactions
fchollet commented, May 26, 2015

An LSTM layer, as per the docs, returns only the last output vector by default rather than the entire sequence. To return the entire sequence (which is necessary for stacking LSTMs), use the constructor argument return_sequences=True.
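For illustration, here is the original snippet with that fix applied: a minimal sketch, assuming the same Keras 0.x positional (input_dim, output_dim) signatures as in the question; the only change is return_sequences=True on the first LSTM.

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM

nb_chars = 26 # a..z
nb_nodes = 50

model = Sequential()
model.add(Embedding(nb_chars, nb_chars))
# return_sequences=True makes this LSTM emit the full 3D sequence
# (samples, timesteps, features) that the next LSTM expects,
# instead of only the last 2D output vector.
model.add(LSTM(nb_chars, nb_nodes, activation='sigmoid',
               inner_activation='hard_sigmoid', return_sequences=True))
model.add(Dropout(0.5))
# The last LSTM keeps the default return_sequences=False, so it
# outputs a single 2D vector that the Dense layer can consume.
model.add(LSTM(nb_nodes, nb_nodes, activation='sigmoid',
               inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(nb_nodes, nb_chars))
model.add(Activation('sigmoid'))
model.add(Dropout(0.5))

model.compile(loss='binary_crossentropy', optimizer='rmsprop')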

0 reactions
StuartFarmer commented, Sep 27, 2016

LSTM layers require their input to have the shape [samples, timesteps, features].
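As a quick illustration (a hypothetical toy batch, not from the original issue), the expected 3D layout looks like this:

import numpy as np

# A toy batch: 32 samples, 10 timesteps, 26 features per timestep.
X = np.zeros((32, 10, 26))
print(X.shape)  # (32, 10, 26) -> (samples, timesteps, features)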

Read more comments on GitHub >

Top Results From Across the Web

  • Tensorflow Keras - Error while stacking LSTM layers
    Adding additional LSTMs in the mix yields the following error which I cannot really understand. I'm using python 3.7.3 on Linux Ubuntu x64....
  • How to stack multiple LSTMs in keras?
    The solution is to add return_sequences=True to all LSTM layers except the last one so that its output tensor has ndim=3 (i.e. batch...
  • Stacked Long Short-Term Memory Networks
    The Stacked LSTM is an extension to this model that has multiple hidden LSTM layers where each layer contains multiple memory cells.
  • Predicting machine's performance record using the stacked ...
    In this study, networks with two LSTM layers were investigated. The loss value was evaluated by the root-mean-square error (RMSE).
  • LSTM part 2 - Stateful and Stacking - YouTube
    Here we discuss how to stack LSTMs and what Stateful LSTMs are. Again we use the Keras Deep Learning Library. Note that I can...
