Bidirectional Wrapper: regularization is not applied to reverse network
The regularization specified via the `bias_regularizer`, `kernel_regularizer`, and `recurrent_regularizer` parameters of the GRU layer wrapped by the `Bidirectional` wrapper appears not to be applied to the reverse layer. Here is my definition of such a layer:
```python
model.add(Bidirectional(GRU(hidden_size,
                            kernel_initializer='glorot_uniform',
                            recurrent_initializer='orthogonal',
                            return_sequences=True,
                            bias_regularizer=l2(l=B_REG),
                            kernel_regularizer=l2(l=W_REG),
                            recurrent_regularizer=l2(l=W_REG),
                            dropout=DROPOUT,
                            recurrent_dropout=DROPOUT,
                            implementation=2,
                            unroll=False),
                        merge_mode='concat',
                        input_shape=(None, input_size)))
```
Below is a plot of the weight distributions (each vertical line extends from −1σ to +1σ) as a function of training epoch, for a model in which `recurrent_regularizer` was zero but `bias_regularizer` and `kernel_regularizer` were non-zero. The effect of regularization is clearly visible in the input weights and biases of the forward layer, but not in those of the reverse layer.
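A quicker way to see the same asymmetry without training (a minimal sketch with arbitrary sizes and coefficients, not taken from the model above) is to count the regularization loss terms the model actually collects; three regularizers applied to both directions should yield six terms, so a smaller count suggests the reverse layer is not being regularized:

```python
# Minimal sketch (arbitrary sizes/coefficients): count the regularization
# loss terms collected by the model. Three regularizers on two directions
# should contribute six terms; a smaller count points at the reverse
# layer being skipped.
from keras.models import Sequential
from keras.layers import GRU, Bidirectional
from keras.regularizers import l2

model = Sequential()
model.add(Bidirectional(GRU(8,
                            return_sequences=True,
                            kernel_regularizer=l2(0.01),
                            recurrent_regularizer=l2(0.01),
                            bias_regularizer=l2(0.01)),
                        merge_mode='concat',
                        input_shape=(None, 4)))

print(len(model.losses))  # expected: 6 (3 per direction)
```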
- Check that you are up-to-date with the master branch of Keras. You can update with: `pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps`
- If running on TensorFlow, check that you are up-to-date with the latest version. The installation instructions can be found here.
- If running on Theano, check that you are up-to-date with the master branch of Theano. You can update with: `pip install git+git://github.com/Theano/Theano.git --upgrade --no-deps`
- Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).
Thanks for the reminder.
Just a note that this issue can be closed. Pull request #10012 fixes the bug and has been merged.
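For anyone who wants to confirm the behaviour after upgrading, a rough check in the spirit of the plot above (a sketch on random placeholder data, not the original experiment) is to train briefly with strong L2 regularization and compare the spread of the forward and backward kernels; the original plot showed shrinkage only on the forward side:

```python
# Sketch on random placeholder data: train briefly with strong L2
# regularization and compare the spread of the forward and backward
# kernels. With the fix in place, both directions should shrink
# comparably; with the bug, only the forward direction did.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, GRU, Bidirectional
from keras.regularizers import l2

model = Sequential()
model.add(Bidirectional(GRU(16,
                            kernel_regularizer=l2(0.1),
                            recurrent_regularizer=l2(0.1),
                            bias_regularizer=l2(0.1)),
                        input_shape=(None, 4)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

x = np.random.randn(256, 10, 4)
y = np.random.randn(256, 1)
model.fit(x, y, epochs=5, verbose=0)

wrapper = model.layers[0]
fwd_kernel = wrapper.forward_layer.get_weights()[0]  # GRU input kernel
bwd_kernel = wrapper.backward_layer.get_weights()[0]
print('forward kernel std: ', fwd_kernel.std())
print('backward kernel std:', bwd_kernel.std())
```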