
Bidirectional Wrapper: regularization is not applied to reverse network

See original GitHub issue

The regularization specified in the bias_regularizer, kernel_regularizer and recurrent_regularizer parameters of the GRU layer wrapped by the Bidirectional wrapper appears not to be applied to the reverse layer. Here is my definition of such a layer:

from keras.layers import Bidirectional, GRU
from keras.regularizers import l2

model.add(Bidirectional(
    GRU(hidden_size,
        kernel_initializer='glorot_uniform',
        recurrent_initializer='orthogonal',
        return_sequences=True,
        bias_regularizer=l2(l=B_REG),
        kernel_regularizer=l2(l=W_REG),
        recurrent_regularizer=l2(l=W_REG),
        dropout=DROPOUT,
        recurrent_dropout=DROPOUT,
        implementation=2,
        unroll=False),
    merge_mode='concat',
    input_shape=(None, input_size)))

Below is a plot of the distribution of weights (the vertical lines extend from −1σ to +1σ) as a function of epoch, for a training run where the recurrent_regularizer was zero but the bias_regularizer and kernel_regularizer were non-zero. The effect of regularization is clearly visible in the input weights and biases of the forward layer, but not of the reverse layer.

[Plot: per-epoch weight distributions (±1σ); regularization visibly shrinks the forward layer's input weights and biases but not the reverse layer's]
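
A quick way to check this from code rather than from weight plots is to compare the two internal copies that the wrapper builds and count the collected regularization penalties. The sketch below is illustrative only: it assumes standalone Keras 2.x, where the Bidirectional wrapper exposes forward_layer and backward_layer attributes and regularizer penalties are gathered into model.losses; the layer sizes and l2 factors are arbitrary.

# Minimal diagnostic sketch (assumes standalone Keras 2.x with
# `forward_layer` / `backward_layer` attributes on the wrapper).
from keras.models import Sequential
from keras.layers import Bidirectional, GRU
from keras.regularizers import l2

model = Sequential()
model.add(Bidirectional(
    GRU(64, return_sequences=True,
        kernel_regularizer=l2(1e-4),
        recurrent_regularizer=l2(1e-4),
        bias_regularizer=l2(1e-4)),
    merge_mode='concat', input_shape=(None, 32)))

bidir = model.layers[0]
# Both copies should carry the regularizers cloned from the wrapped layer's config.
print(bidir.forward_layer.kernel_regularizer)
print(bidir.backward_layer.kernel_regularizer)
# Each regularizer should contribute one loss term: three per direction, six in
# total. On affected versions the backward layer's terms would be missing here.
print(len(model.losses))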

  • Check that you are up-to-date with the master branch of Keras. You can update with: pip install git+git://github.com/fchollet/keras.git --upgrade --no-deps

  • If running on TensorFlow, check that you are up-to-date with the latest version. The installation instructions can be found here.

  • If running on Theano, check that you are up-to-date with the master branch of Theano. You can update with: pip install git+git://github.com/Theano/Theano.git --upgrade --no-deps

  • Provide a link to a GitHub Gist of a Python script that can reproduce your issue (or just copy the script here if it is short).

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Reactions: 3
  • Comments: 7 (1 by maintainers)

Top GitHub Comments

1 reaction
drspiffy commented, Jun 10, 2018

Thanks for the reminder

1 reaction
reidjohnson commented, Jun 9, 2018

Just a note that this issue can be closed. Pull request #10012 fixes the bug and has been merged.
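For anyone landing here later: besides upgrading to a Keras version that includes the fix, newer tf.keras releases also let you pass the backward copy explicitly, so nothing depends on how the wrapper clones the forward layer's configuration. The sketch below is a rough alternative illustration under that assumption (tf.keras 2.x, where Bidirectional accepts a backward_layer argument; see the TensorFlow docs result further down), not the change made in the merged pull request itself.

import tensorflow as tf
from tensorflow.keras import layers, regularizers

forward = layers.GRU(64, return_sequences=True,
                     kernel_regularizer=regularizers.l2(1e-4),
                     recurrent_regularizer=regularizers.l2(1e-4),
                     bias_regularizer=regularizers.l2(1e-4))
# The backward copy must set go_backwards=True and is given its own
# regularizers, so they are applied regardless of how the wrapper copies config.
backward = layers.GRU(64, return_sequences=True, go_backwards=True,
                      kernel_regularizer=regularizers.l2(1e-4),
                      recurrent_regularizer=regularizers.l2(1e-4),
                      bias_regularizer=regularizers.l2(1e-4))

inputs = tf.keras.Input(shape=(None, 32))
outputs = layers.Bidirectional(forward, backward_layer=backward,
                               merge_mode='concat')(inputs)
model = tf.keras.Model(inputs, outputs)
print(len(model.losses))  # expect six penalty terms, three per direction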


Top Results From Across the Web

  • How to Develop a Bidirectional LSTM For Sequence ...
    The first on the input sequence as-is and the second on a reversed copy of the input sequence. This can provide additional context...

  • tf.keras.layers.Bidirectional | TensorFlow v2.11.0
    If backward_layer is not provided, the layer instance passed as the layer argument will be used to generate the backward layer automatically.

  • tf.keras.layers.Bidirectional | TensorFlow
    This method is the reverse of get_config, capable of instantiating the same layer from the config dictionary. It does not handle layer...

  • Understanding Bidirectional RNN in PyTorch | by Ceshine Lee
    This structure allows the networks to have both backward and forward information about the sequence at every time step. The concept seems easy...

  • Advanced Use of Recurrent Neural Networks: Part 8
    A bidirectional RNN exploits the order sensitivity of RNNs: it consists of using two regular RNNs, such as the GRU and LSTM layers...
