Stacking multiple LSTM layers yields an error
See original GitHub issue

I tried to create a network with multiple LSTM layers. No matter what I try, this (or similar attempts) yields an error:
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.layers.embeddings import Embedding
from keras.layers.recurrent import LSTM
nb_chars = 26 # a..z
nb_nodes = 50
model = Sequential()
model.add(Embedding(nb_chars, nb_chars))
model.add(LSTM(nb_chars, nb_nodes, activation='sigmoid', inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(LSTM(nb_nodes, nb_nodes, activation='sigmoid', inner_activation='hard_sigmoid'))
model.add(Dropout(0.5))
model.add(Dense(nb_nodes, nb_chars))
model.add(Activation('sigmoid'))
model.add(Dropout(0.5))
model.compile(loss='binary_crossentropy', optimizer='rmsprop')
Error:
Traceback (most recent call last):
File "test.py", line 21, in <module>
model.compile(loss='binary_crossentropy', optimizer='rmsprop')
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/models.py", line 71, in compile
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/models.py", line 155, in get_output
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 115, in get_output
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 140, in get_output
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 233, in get_output
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 115, in get_output
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/core.py", line 27, in get_input
File "/home/tener/.local/lib/python3.4/site-packages/Keras-0.0.1-py3.4.egg/keras/layers/recurrent.py", line 338, in get_output
File "/home/tener/.local/lib/python3.4/site-packages/Theano-0.7.0-py3.4.egg/theano/tensor/var.py", line 341, in dimshuffle
pattern)
File "/home/tener/.local/lib/python3.4/site-packages/Theano-0.7.0-py3.4.egg/theano/tensor/elemwise.py", line 141, in __init__
(i, j, len(input_broadcastable)))
ValueError: new_order[2] is 2, but the input only has 2 axes.
If I replace one of the LSTM layers with, say, a Dense layer, it works. I cannot figure out why: according to the documentation, the output of the first LSTM and the input of the second should match.
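A minimal shape walk-through (plain NumPy, not Keras; the weight arrays are stand-ins) suggests what is going on: by default an LSTM layer returns only its last output vector, collapsing the time axis, so the second LSTM receives a tensor with 2 axes where it expects 3.

```python
import numpy as np

# Hedged sketch of the shape mismatch behind the ValueError.
batch, timesteps, nb_chars, nb_nodes = 4, 7, 26, 50

emb_out = np.zeros((batch, timesteps, nb_chars))  # Embedding output: 3 axes
w = np.zeros((nb_chars, nb_nodes))                # stand-in for LSTM weights

# By default the layer keeps only the last timestep's output:
lstm1_out = (emb_out @ w)[:, -1, :]
print(lstm1_out.ndim)   # 2 axes -- the next LSTM's dimshuffle then asks for
                        # axis 2, hence "new_order[2] is 2, but the input
                        # only has 2 axes"
```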
Issue Analytics
- State:
- Created 8 years ago
- Comments: 6 (2 by maintainers)
Top Results From Across the Web

Tensorflow Keras - Error while stacking LSTM layers
Adding additional LSTMs in the mix yields the following error which I cannot really understand. I'm using python 3.7.3 on Linux Ubuntu x64....

How to stack multiple LSTMs in keras?
The solution is to add return_sequences=True to all LSTM layers except the last one so that its output tensor has ndim=3 (i.e. batch...

Stacked Long Short-Term Memory Networks
The Stacked LSTM is an extension to this model that has multiple hidden LSTM layers where each layer contains multiple memory cells.

Predicting machine's performance record using the stacked ...
In this study, networks with two LSTM layers were investigated. The loss value was evaluated by the root-mean-square error (RMSE).

LSTM part 2 - Stateful and Stacking - YouTube
Here we discuss how to stack LSTMs and what Stateful LSTMs are. Again we use Keras Deep Learning Library. Note that I can...
Top GitHub Comments
A LSTM layer, as per the docs, will return the last vector by default rather than the entire sequence. In order to return the entire sequence (which is necessary to be able to stack LSTM), use the constructor argument
return_sequences=True
.LSTM layers have to have input in the size [samples, timesteps, features]
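The effect of return_sequences can be sketched without Keras at all. The toy function below (pure NumPy; toy_rnn and its weights are hypothetical stand-ins, not real LSTM math) only illustrates the output shapes: with return_sequences=True the time axis survives, so the result can feed another recurrent layer.

```python
import numpy as np

def toy_rnn(x, units, return_sequences=False):
    """Shape-only stand-in for a recurrent layer.

    x: (samples, timesteps, features)
    returns (samples, timesteps, units) if return_sequences else (samples, units)
    """
    samples, timesteps, features = x.shape
    w = np.ones((features, units))           # stand-in for the real weights
    outputs = x @ w                          # one output vector per timestep
    return outputs if return_sequences else outputs[:, -1, :]

x = np.zeros((4, 7, 26))                                  # [samples, timesteps, features]
print(toy_rnn(x, 50, return_sequences=True).shape)   # (4, 7, 50) -- 3 axes, stackable
print(toy_rnn(x, 50, return_sequences=False).shape)  # (4, 50)    -- last step only
```

This mirrors the fix quoted above: every LSTM except the last one in the stack should keep the time axis.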