
Setting dropout rate via layer.rate doesn't work


Hello there,

Suppose you’ve defined a Keras Model with the functional API, and you want to change the dropout rate of the Dropout layers after you’ve instantiated the Model. How do you do this?


I’ve tried to do the following:

from keras.layers import Dropout
for layer in model.layers:
    if isinstance(layer, Dropout):
        layer.rate = 0.0
        print(layer.get_config())

Based on the updated config of the Dropout layers, this should work:

{'noise_shape': None, 'rate': 0.2, 'trainable': True, 'seed': None, 'name': 'dropout_1'} -> {'noise_shape': None, 'rate': 0.0, 'trainable': True, 'seed': None, 'name': 'dropout_1'}

However, I can tell you that this does not work: during training, the old dropout values are still used. I’ve also tried compiling the model again after the layer loop (model.compile()) and even making a new model (model = Model(inputs=model.input, outputs=model.output)), but the problem persists.


This issue can easily be tested with a VGG-like CNN with dropout layers and a small data sample (e.g. 100 images): just try to overfit the data. If you instantiate the net with a dropout rate of e.g. 0.2, the model will have a hard time overfitting the small data sample. Using the above code snippet, which should set the dropout rate to 0, does not change anything. However, if you directly instantiate the net with a dropout rate of 0.0, it immediately overfits the data sample.

Thus, setting layer.rate changes the dropout rate in the layer config, but somehow the old dropout rate is still used during training.
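
Since the updated rate does show up in get_config(), one obvious workaround would be to rebuild the model from its serialized config, so that a fresh graph is constructed with the new rate. A rough, untested sketch of that idea (the optimizer and loss below are just placeholders):

from keras.models import model_from_json
from keras.layers import Dropout

# Change the rate on the existing layers (this only updates the config).
for layer in model.layers:
    if isinstance(layer, Dropout):
        layer.rate = 0.0

# Rebuild the model from the updated config and copy the weights over;
# the rebuilt graph should then use the new dropout rate.
new_model = model_from_json(model.to_json())
new_model.set_weights(model.get_weights())
new_model.compile(optimizer='adam', loss='categorical_crossentropy')  # placeholders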


I’ve also taken a look at the Dropout layer source. The only thing I can think of is that maybe the __init__ of the Dropout layer is not called again after changing the rate, so the old dropout rate is still used in call:

    def __init__(self, rate, noise_shape=None, seed=None, **kwargs):
        super(Dropout, self).__init__(**kwargs)
        self.rate = min(1., max(0., rate))
        self.noise_shape = noise_shape
        self.seed = seed

But this is just a guess. I’m using Keras 2.1.2 with the TensorFlow backend.
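
Another guess: since the model has already been built, the dropout op with the original rate is presumably already baked into the TensorFlow graph, so changing the Python attribute afterwards would not affect it. A small illustrative sketch that seems consistent with this (Keras 2 with the TensorFlow backend):

import numpy as np
from keras import backend as K
from keras.layers import Input, Dropout
from keras.models import Model

# Build a tiny functional model so the dropout op is baked into the graph.
inp = Input(shape=(5,))
out = Dropout(0.9)(inp)
model = Model(inp, out)

# Forward pass in training mode (learning phase = 1).
f = K.function([model.input, K.learning_phase()], [model.output])
x = np.ones((4, 5))
print(f([x, 1])[0])  # heavy dropout: most entries are zeroed

# Changing the attribute updates the config but not the op already in the graph.
model.layers[-1].rate = 0.0
print(f([x, 1])[0])  # values are still being dropped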

Does anyone have an idea? Thanks a lot!

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Reactions: 2
  • Comments: 16

Top GitHub Comments

2 reactions
civilinformer commented, Jun 12, 2018

Fixed formatting.

2 reactions
mpariente commented, Dec 20, 2017

Here is some sample code which checks whether the rate has changed:

import numpy as np
from keras import backend as K
from keras.layers import Dropout

dummy_input = np.ones((5, 5))

# Run in training mode so dropout is actually applied.
K.set_learning_phase(1)
dropout_test = Dropout(0.3)
out_1 = dropout_test.call(dummy_input)
print(K.eval(out_1))

# Calling the layer again after changing the attribute picks up the new rate.
dropout_test.rate = 0.5
out_2 = dropout_test.call(dummy_input)
print(K.eval(out_2))

You can see from the outputs that the dropout rate is different.
