Can't use chainer.grad for LSTM
I tried to differentiate an LSTM block with the chainer.grad function, using the enable_double_backprop=True option, and got the error shown below.
Code:
import numpy as np
import chainer
import chainer.links as L
x = chainer.Variable(np.random.rand(10, 20).astype('f'))
lstm = L.LSTM(20, 20)
y = lstm(x)
dydx, = chainer.grad([y], [x], enable_double_backprop=True)
Error:
~/.pyenv/versions/anaconda3-5.0.0/lib/python3.6/site-packages/chainer/functions/activation/lstm.py in backward(self, indexes, grads)
111 def backward(self, indexes, grads):
112 grad_inputs = (
--> 113 self.get_retained_inputs() + self.get_retained_outputs() + grads)
114 return LSTMGrad()(*grad_inputs)
Is this a bug in the LSTM module, or am I misusing LSTM? I'm using Chainer v4.0.0b3.
Any comments would help. Thanks.
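A possible workaround (an untested sketch, not something suggested in this thread) is to pass an explicit zero gradient for the updated cell state lstm.c through the grad_outputs argument of chainer.grad, so that the LSTM backward never receives a None gradient for one of its outputs:
import numpy as np
import chainer
import chainer.links as L
x = chainer.Variable(np.random.rand(10, 20).astype('f'))
lstm = L.LSTM(20, 20)
y = lstm(x)
# Seed gradients: ones for y (what chainer.grad would use by default) and
# zeros for the cell state, so that no output gradient is left as None.
gy = chainer.Variable(np.ones_like(y.data))
gc = chainer.Variable(np.zeros_like(lstm.c.data))
dydx, = chainer.grad([y, lstm.c], [x], grad_outputs=[gy, gc],
                     enable_double_backprop=True)
The zero gradient on lstm.c does not change dy/dx; it only keeps a None entry out of the grads tuple that reaches the LSTM backward.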
Issue Analytics
- Created: 6 years ago
- Comments: 5 (5 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
- It seems to work if the lines [code snippet missing] are added to LSTMGrad.backward. Sorry, [snippet missing] should be [snippet missing].
- It seems that leaving lstm.h.grad as None is fine: [code snippet missing]
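A minimal sketch of that check (a hypothetical reconstruction, not the original snippet, and assuming a Chainer version in which the crash above no longer occurs):
import numpy as np
import chainer
import chainer.links as L
x = chainer.Variable(np.random.rand(10, 20).astype('f'))
lstm = L.LSTM(20, 20)
y = lstm(x)
# With the fix applied, this call is expected to succeed ...
dydx, = chainer.grad([y], [x], enable_double_backprop=True)
# ... while the hidden state kept by the link has no accumulated gradient,
# because chainer.grad does not set .grad unless set_grad=True is passed.
assert lstm.h.grad is None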