Add dropout and recurrent_dropout to CuDNNLSTM and CuDNNGRU
Native Keras GRU and LSTM layers support dropout and recurrent_dropout, but their CuDNN-accelerated counterparts, CuDNNLSTM and CuDNNGRU, do not. It might be good to add these features. Although cuDNN RNNs do not support dropout natively, it seems possible to implement it outside of cuDNN; at least TensorFlow is capable of that. In Keras, dropout can be applied either to the inputs (dropout), which should be straightforward (see the sketch below), or to the previous hidden state (recurrent_dropout). I’m not sure whether the latter is possible, though.
The motivation is to use the CuDNN RNN implementation for fast training while still allowing dropout regularization. Please comment on whether this makes sense and is wanted. I’d be happy to try implementing it. Thanks.
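As a minimal sketch of the input-dropout half of this (assuming Keras 2.x, where CuDNNLSTM is a standalone layer; the shapes and sizes here are illustrative): the dropout is applied by a separate layer in front of the CuDNN kernel, so the fused op itself is left untouched. SpatialDropout1D is used because it drops whole feature channels, i.e. the same mask at every timestep, which is closer to the built-in dropout argument than plain per-element Dropout.

```python
# Sketch only: input dropout applied outside the CuDNN kernel.
# Assumes Keras 2.x, where CuDNNLSTM is a standalone layer.
from keras.models import Sequential
from keras.layers import SpatialDropout1D, CuDNNLSTM, Dense

model = Sequential([
    # Drops entire feature channels, reusing the same mask at every
    # timestep (closer to the built-in `dropout` argument than Dropout).
    SpatialDropout1D(0.2, input_shape=(100, 64)),
    CuDNNLSTM(128),  # fast fused kernel; exposes no dropout hooks of its own
    Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy')
```

Recurrent dropout has no equivalent workaround at this level, since it has to act on the hidden state inside the time loop.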
Issue Analytics
- State:
- Created: 6 years ago
- Reactions: 38
- Comments: 30 (3 by maintainers)
Recurrent dropout is not implemented in the cuDNN RNN ops at the cuDNN level, so we can’t have it in Keras.
The dropout option in the cuDNN API is not recurrent dropout (unlike the recurrent_dropout in Keras), so it is basically useless: regular dropout doesn’t work with RNNs.
In fact, using such dropout in a stacked RNN will wreck training.
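To make the distinction concrete, here is a minimal NumPy sketch of a vanilla RNN step (illustrative only, not Keras or cuDNN code), showing where each kind of dropout enters the recurrence. Input dropout only touches the input x_t and can be applied outside a fused kernel; recurrent dropout has to mask the previous hidden state inside the time loop, which the fused cuDNN op, running the whole loop as a single kernel, does not expose.

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_step(x_t, h_prev, W, U, b, input_mask, recurrent_mask):
    # `dropout` in Keras: mask the input before the input-to-hidden
    # transform. This can be applied outside the fused kernel.
    x_t = x_t * input_mask
    # `recurrent_dropout` in Keras: mask the previous hidden state before
    # the hidden-to-hidden transform. This needs a hook *inside* the time
    # loop, which the fused cuDNN RNN op does not expose.
    h_prev = h_prev * recurrent_mask
    return np.tanh(x_t @ W + h_prev @ U + b)

T, d_in, d_h, rate = 5, 4, 3, 0.5
W = rng.normal(size=(d_in, d_h))
U = rng.normal(size=(d_h, d_h))
b = np.zeros(d_h)
x = rng.normal(size=(T, d_in))

# Masks are sampled once and reused at every timestep (the "variational"
# scheme Keras uses), with inverted-dropout rescaling.
input_mask = (rng.random(d_in) > rate) / (1.0 - rate)
recurrent_mask = (rng.random(d_h) > rate) / (1.0 - rate)

h = np.zeros(d_h)
for t in range(T):
    h = rnn_step(x[t], h, W, U, b, input_mask, recurrent_mask)
print(h)
```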
This would be tremendously helpful to many, many people. Not being able to use dropout often renders the CuDNN layers virtually useless for training on smaller datasets.