
How to change regularization parameters during training?

See original GitHub issue

Hi all,

I am trying to implement a flexible regularization scheduler. I instantiate a layer like this:

x = Convolution2D(..., W_regularizer=l2(10), ...)

and later change the regularization:

model.layers[1].W_regularizer = l2(0)

I can verify the layer’s settings changed:

model.layers[1].W_regularizer.l2
Out[9]: array(0.0, dtype=float32)

but this has no effect on subsequent training, whether or not I compile the model anew. What is the caveat I am missing?

Issue Analytics

  • State: closed
  • Created: 7 years ago
  • Reactions: 2
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

8 reactions
bstriner commented, Dec 24, 2016

Hi Alexander,

The hyperparameters are built into the training function when you compile. Editing the model after compilation won’t affect your current training. You will see the same issue if you try to change learning rates or other hyperparameters.

The way to modify hyperparameters during training is to use backend variables in the training function and update those variables during training.

The L1L2Regularizer isn’t using variables, but it should be. In https://github.com/fchollet/keras/blob/master/keras/regularizers.py, change:

self.l2 = K.cast_to_floatx(l2)

to:

self.l2 = K.variable(K.cast_to_floatx(l2))

Instantiate the layer, but hold a reference to the regularizer:

reg = l2(10)
x = Convolution2D(W_regularizer=reg)

During training, update the variable reg.l2:

K.set_value(reg.l2, K.cast_to_floatx(100))
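
Putting that together, a minimal sketch of a variable-backed regularizer (the class name VariableL2 and the example values are just for illustration, and it uses the Keras 2-style Regularizer interface, where the regularizer is called on the weight tensor, as in the later comment below):

from keras import backend as K
from keras.regularizers import Regularizer

class VariableL2(Regularizer):
    """L2 regularizer whose strength is a backend variable, so it can be
    changed after compile() with K.set_value (illustrative sketch)."""

    def __init__(self, l2=0.01):
        # keep the factor in a backend variable instead of a plain float
        self.l2 = K.variable(K.cast_to_floatx(l2), name='l2')

    def __call__(self, x):
        # the penalty reads the variable, so updates take effect on the next batch
        return K.sum(self.l2 * K.square(x))

    def get_config(self):
        return {'l2': float(K.get_value(self.l2))}

# hold a reference to the regularizer when building the layer, e.g.
# x = Convolution2D(..., W_regularizer=reg)
reg = VariableL2(10.0)

# later, during training, change the strength without recompiling
K.set_value(reg.l2, K.cast_to_floatx(0.0))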

Might want to add a pull request to make l1l2 into variables.

Cheers, Ben

4 reactions
marioviti commented, May 22, 2018

Hi,

I’ve been taking inspiration from this conversation and came up with a solution that works extremely well:

1st: Extend Regularizer with a custom L1/L2 regularizer class (do not call it L1L2: during serialization, i.e. when you save and reload your model, shadowing the built-in name does not work). It should go something like this:

from keras import backend as K
from keras.regularizers import Regularizer

class L1L2_m(Regularizer):
    """Regularizer for L1 and L2 regularization.
    # Arguments
        l1: Float; L1 regularization factor.
        l2: Float; L2 regularization factor.
    """

    def __init__(self, l1=0.0, l2=0.01):
        with K.name_scope(self.__class__.__name__):
            self.l1 = K.variable(l1,name='l1')
            self.l2 = K.variable(l2,name='l2')
            self.val_l1 = l1
            self.val_l2 = l2
            
    def set_l1_l2(self,l1,l2):
        K.set_value(self.l1,l1)
        K.set_value(self.l2,l2)
        self.val_l1 = l1
        self.val_l2 = l2

    def __call__(self, x):
        regularization = 0.
        if self.val_l1 > 0.:
            regularization += K.sum(self.l1 * K.abs(x))
        if self.val_l2 > 0.:
            regularization += K.sum(self.l2 * K.square(x))
        return regularization

    def get_config(self):
        config = {'l1': float(K.get_value(self.l1)),
                  'l2': float(K.get_value(self.l2))}
        return config

2nd: Register your custom object so that when you export your model you won’t have any issues reloading it.

from keras.utils.generic_utils import get_custom_objects
get_custom_objects().update({ L1L2_m.__name__: L1L2_m })
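
As a quick usage sketch (the file name is just an example): once L1L2_m is registered this way, a saved model should reload without passing custom_objects explicitly:

from keras.models import load_model

model.save('model_with_L1L2_m.h5')             # file name is illustrative
restored = load_model('model_with_L1L2_m.h5')  # L1L2_m is found via the registry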

3rd: Update your variables with the custom object’s set_l1_l2 method, accessing the regularizer objects through the Keras model:

def set_model_l1_l2(model, l1, l2):
    for layer in model.layers:
        if 'kernel_regularizer' in dir(layer) and \
                isinstance(layer.kernel_regularizer, L1L2_m):
            layer.kernel_regularizer.set_l1_l2(l1, l2)

Done.
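
To wire this into training, here is a sketch using a Keras callback; the callback name and the schedule (l1 fixed at 0, l2 halved each epoch) are illustrative assumptions, not part of the steps above:

from keras.callbacks import Callback

class RegularizationScheduler(Callback):
    """Updates every L1L2_m regularizer in the model at the start of each epoch."""

    def __init__(self, schedule):
        super(RegularizationScheduler, self).__init__()
        self.schedule = schedule  # function: epoch -> (l1, l2)

    def on_epoch_begin(self, epoch, logs=None):
        l1, l2 = self.schedule(epoch)
        set_model_l1_l2(self.model, l1, l2)

# example: keep l1 at 0 and halve l2 every epoch (values are illustrative)
scheduler = RegularizationScheduler(lambda epoch: (0.0, 0.01 * 0.5 ** epoch))
model.fit(x_train, y_train, epochs=10, callbacks=[scheduler])  # x_train/y_train: your data

Note that L1L2_m.__call__ checks val_l1 and val_l2 when the graph is built, so a factor that starts at 0 never enters the graph and cannot be switched on later; start from a non-zero value if you plan to increase it during training.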

But wait, can’t I just access the variables from tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES), which is, by the way, the same collection as model.trainable_variables?

Yes, you could, but I strongly suggest you do not.

Why? Because at declaration time the variable scope depends on which layer the L1L2_m was defined in (a convolutional layer, for example Conv1). So if you look into the graph you’ll find variable scopes that look something like Conv1/L1L2/l1 … or Conv10/L1L2/l1 … But the Keras deserializer does not work like that: if you save and reload your model (JSON or h5 format), you’ll find them all grouped together as L1L2_1/l1, L1L2_2/l1, and so on.

Using the object-reference method set_l1_l2 gives the same result every time, even with a different graph representation.

Read more comments on GitHub >

Top Results From Across the Web

  • Change keras regularizer during training / dynamic ...
    We want modify hyperparameters during training, and the way to do it is use backend variables in the training function and update those ...
  • Regularization: Machine Learning
    Lambda's purpose is to give a good fit for the training data while limiting the values of the parameters, thereby keeping the hypothesis ...
  • Training Parameters - Amazon Machine Learning
    For information about the default model size, see Training Parameters: Types and Default Values. For more information about regularization, see Regularization.
  • How to Add Regularization to Keras Pre-trained Models the ...
    Although in practice this argument may sound right, there is an important catch in here. Even though you may be able to fit...
  • How to Use Weight Decay to Reduce Overfitting of Neural ...
    Weight Regularization in Keras; Examples of Weight Regularization ... We can see no change in the accuracy on the training dataset and an ...
