
When Dropout layer is shared Siamese-style, dropped units are not synchronized.

See original GitHub issue

When sharing Dropout layers Siamese-style, I wasn’t able to synchronize the dropped units. For example, in the code below, noise_shape has no effect, and the seed parameter has no effect either. Shared Dropout layers should synchronize dropped units by default; otherwise they are not shared in any meaningful way.

# https://gist.github.com/ozabluda/bbc6c84c0e69bfd9ca55170fd3ab040d
# https://github.com/keras-team/keras/issues/8802

from keras.models import Sequential, Model
from keras.layers import Dense, Dropout, Input, Lambda
from keras import backend as K
import numpy as np

m = Sequential([
    Dropout(rate=0.5, input_shape=(1,), noise_shape=(1,1))
])
m.summary()

input_a = Input(shape=(1,))
input_b = Input(shape=(1,))

processed_a = m(input_a)
processed_b = m(input_b)

def l1_distance(tensors):
    # unpack the two branch outputs (Python 3 has no tuple-parameter unpacking)
    x1, x2 = tensors
    return K.sum(K.abs(x1 - x2), axis=1)

c = Lambda(l1_distance, output_shape=(1,))([processed_a, processed_b])
s = Model([input_a, input_b], c)
s.compile(optimizer='sgd', loss='mse')
s.summary()

x0 = np.array([1])
x1 = np.array([1])
x  = [x0,x1]
y  = np.array([0])

s.fit(x, y, verbose=1, epochs=10)

print(s.evaluate(x,y), s.predict(x))

Output:

Epoch 1/10
1/1 [==============================] - 1s 1s/step - loss: 0.0000e+00
Epoch 2/10
1/1 [==============================] - 0s 3ms/step - loss: 4.0000
Epoch 3/10
1/1 [==============================] - 0s 3ms/step - loss: 4.0000
Epoch 4/10
1/1 [==============================] - 0s 4ms/step - loss: 0.0000e+00
Epoch 5/10
1/1 [==============================] - 0s 3ms/step - loss: 0.0000e+00
Epoch 6/10
1/1 [==============================] - 0s 3ms/step - loss: 0.0000e+00
Epoch 7/10
1/1 [==============================] - 0s 3ms/step - loss: 4.0000
Epoch 8/10
1/1 [==============================] - 0s 3ms/step - loss: 4.0000
Epoch 9/10
1/1 [==============================] - 0s 3ms/step - loss: 4.0000
Epoch 10/10
1/1 [==============================] - 0s 4ms/step - loss: 0.0000e+00
1/1 [==============================] - 0s 14ms/step
(0.0, array([ 0.], dtype=float32))

Note that

  1. during inference nothing is dropped, so the loss is 0, as expected
  2. if the dropout rate is 0, the loss is 0 during training, as expected
  3. on epochs 2, 3, 7, 8, 9 the loss is 4, which I don’t understand at all. Maybe another bug; I need to investigate further (one possible explanation is sketched below).
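
One possible explanation (a sketch of my own, not verified against the Keras internals): Keras uses inverted dropout, so surviving units are scaled by 1/(1 - rate) = 2 during training. With input 1 and rate 0.5, whenever the two unsynchronized branches draw different masks, one branch outputs 2 and the other 0, giving an L1 distance of 2 and a squared error of 4 against the target 0; when both branches happen to draw the same mask, the distance is 0:

rate = 0.5
x = 1.0
scale = 1.0 / (1.0 - rate)  # inverted-dropout scaling factor: 2.0

# Enumerate the four possible (keep_a, keep_b) mask combinations for the single unit.
for keep_a in (0.0, 1.0):
    for keep_b in (0.0, 1.0):
        out_a = x * scale * keep_a
        out_b = x * scale * keep_b
        l1 = abs(out_a - out_b)
        print(keep_a, keep_b, l1, l1 ** 2)  # squared error against the target 0
# Mismatched masks give L1 = 2 and loss = 4; matched masks give loss = 0.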

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
hoangcuong2011 commented, May 9, 2019

OK so I implemented a shared Dropout layer by myself, similar to what @fchollet suggested.

import tensorflow as tf
from keras import backend as K
from keras.layers import Layer


class SharedDropout(Layer):
    # Adapted from the TF 1.13 dropout implementation:
    # https://github.com/tensorflow/tensorflow/blob/r1.13/tensorflow/python/ops/nn_ops.py
    def __init__(self, keep_prob_rate=0.5, **kwargs):
        self.keep_prob_rate = keep_prob_rate
        super(SharedDropout, self).__init__(**kwargs)

    def build(self, input_shape):
        super(SharedDropout, self).build(input_shape)

    def call(self, inputs):
        input_left = inputs[0]
        input_right = inputs[1]
        # Draw a single random mask and apply it to both inputs.
        random_tensor = self.keep_prob_rate
        random_tensor += tf.random_uniform(tf.shape(input_left), dtype=input_left.dtype)
        # floor() yields 1 with probability keep_prob_rate, 0 otherwise.
        binary_tensor = tf.floor(random_tensor)

        def dropout_left():
            # Inverted dropout: rescale the survivors by 1 / keep_prob_rate.
            return tf.divide(input_left, self.keep_prob_rate) * binary_tensor

        def dropout_right():
            return tf.divide(input_right, self.keep_prob_rate) * binary_tensor

        # Only drop units in the training phase; pass inputs through at inference.
        return [K.in_train_phase(dropout_left, input_left, training=None),
                K.in_train_phase(dropout_right, input_right, training=None)]

    def compute_output_shape(self, input_shapes):
        return [(input_shapes[0][0], input_shapes[0][1]),
                (input_shapes[1][0], input_shapes[1][1])]

Example of calling the layer:

x_input_left = Input(shape=(10,), name='x_input_left')
x_input_right = Input(shape=(10,), name='x_input_right')

shared_dropout = SharedDropout()
x_left, x_right = shared_dropout([x_input_left, x_input_right])

It took me a few hours to implement/test this. So I am proud of this for a moment 😄
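
As a quick sanity check (a sketch only, assuming standalone Keras on the TF 1.x backend as in this thread, and reusing x_input_left, x_input_right, x_left, x_right from the snippet above), you can evaluate both branches in the training phase and compare their zero patterns; with a shared mask they should coincide:

import numpy as np
from keras import backend as K

# Backend function that evaluates both branch outputs with learning_phase = 1,
# so the shared dropout mask is actually applied.
f = K.function([x_input_left, x_input_right, K.learning_phase()],
               [x_left, x_right])

data = np.ones((4, 10), dtype=np.float32)
out_left, out_right = f([data, data, 1])

# With identical all-ones inputs, a shared mask means the zero patterns match exactly.
print(np.array_equal(out_left == 0, out_right == 0))  # expected: True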

0 reactions
hoangcuong2011 commented, May 9, 2019

I think the issue @ozabluda posted is very good and important. In a Siamese network, it can be risky to apply certain techniques like Dropout and Batch Normalization. I tried this in a Siamese network myself and noticed that the performance was very bad!

Do you have any ideas yet on how to improve this, @ozabluda? Thanks!


