Keras or TensorFlow adds several layers automatically
Hello.
I use Keras Tuner with random search, and TensorFlow, but I do not think random search is the problem:
For some reason, several layers get added automatically.
latenteVariable = 24 ##### IMPORTANT

class MyTuner(kerastuner.tuners.RandomSearch):

def sampling(args):
    z_mean, z_log_var = args
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], latenteVariable), mean=0.,
                              stddev=epsilon_std)
    return z_mean + K.exp(z_log_var / 2) * epsilon

def build_model(hp):
    ...
    h = Dense(units=hp.Int('units4', min_value=48, max_value=64, step=8), activation=activation)(h)
    h = BatchNormalization(name="encoder_norm_4")(h)
    schicht4 = hp.get('units4')
    z_mean = Dense(latenteVariable)(h)
    z_log_var = Dense(latenteVariable)(h)
    z = Lambda(sampling, output_shape=(latenteVariable,))([z_mean, z_log_var])  ###### variable is used here
    b = Dense(units=schicht4, activation=activation)(z)
    b = BatchNormalization(name="decoder_norm_1")(b)
output:
__________________________________________________________________________________________________
encoder_norm_4 (BatchNormalizat (None, 48) 192 dense_3[0][0]
__________________________________________________________________________________________________
dense_4 (Dense) (None, 24) 1176 encoder_norm_4[0][0]
__________________________________________________________________________________________________
dense_5 (Dense) (None, 24) 1176 encoder_norm_4[0][0]
__________________________________________________________________________________________________
lambda (Lambda) (None, 24) 0 dense_4[0][0]
dense_5[0][0]
__________________________________________________________________________________________________
dense_6 (Dense) (None, 48) 1200 lambda[0][0]
__________________________________________________________________________________________________
So above, latenteVariable is a global variable.
Below, latenteVariable is a local variable.
def sampling(args):
    z_mean, z_log_var, latenteVariable = args
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], latenteVariable), mean=0.,
                              stddev=epsilon_std)
    return z_mean + K.exp(z_log_var / 2) * epsilon

def build_model(hp):
    h = Dense(units=hp.Int('units4', min_value=48, max_value=64, step=8), activation=activation)(h)
    h = BatchNormalization(name="encoder_norm_4")(h)
    schicht4 = hp.get('units4')
    latenteVariable = 24 ########## local variable
    z_mean = Dense(latenteVariable)(h)
    z_log_var = Dense(latenteVariable)(h)
    z = Lambda(sampling, output_shape=(latenteVariable,))([z_mean, z_log_var])
    b = Dense(units=schicht4, activation=activation)(z)
    b = BatchNormalization(name="decoder_norm_1")(b)
I get the result:
encoder_norm_4 (BatchNormalizat (None, 64) 256 dense_3[0][0]
__________________________________________________________________________________________________
dense_4 (Dense) (None, 24) 1560 encoder_norm_4[0][0]
__________________________________________________________________________________________________
tf_op_layer_Shape (TensorFlowOp [(2,)] 0 dense_4[0][0]
__________________________________________________________________________________________________
tf_op_layer_strided_slice (Tens [()] 0 tf_op_layer_Shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_shape_1 (TensorFlow [(2,)] 0 tf_op_layer_strided_slice[0][0]
__________________________________________________________________________________________________
dense_5 (Dense) (None, 24) 1560 encoder_norm_4[0][0]
__________________________________________________________________________________________________
tf_op_layer_RandomStandardNorma [(None, 24)] 0 tf_op_layer_shape_1[0][0]
__________________________________________________________________________________________________
tf_op_layer_RealDiv (TensorFlow [(None, 24)] 0 dense_5[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mul (TensorFlowOpLa [(None, 24)] 0 tf_op_layer_RandomStandardNormal[
__________________________________________________________________________________________________
tf_op_layer_Exp (TensorFlowOpLa [(None, 24)] 0 tf_op_layer_RealDiv[0][0]
__________________________________________________________________________________________________
tf_op_layer_Add (TensorFlowOpLa [(None, 24)] 0 tf_op_layer_Mul[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mul_1 (TensorFlowOp [(None, 24)] 0 tf_op_layer_Exp[0][0]
tf_op_layer_Add[0][0]
__________________________________________________________________________________________________
tf_op_layer_AddV2 (TensorFlowOp [(None, 24)] 0 dense_4[0][0]
tf_op_layer_Mul_1[0][0]
__________________________________________________________________________________________________
dense_6 (Dense) (None, 64) 1600 tf_op_layer_AddV2[0][0]
So in the first example I get three layers with output shape (None, 24). In the second example I get eight. How can I use a local variable without getting eight such layers automatically instead of three?
Thank you
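
For reference, one way to keep the whole sampling computation inside a single Lambda layer while still treating the latent size as a local value is Lambda's arguments keyword, which forwards extra keyword arguments to the wrapped function. The sketch below is a minimal, self-contained variant, not the original code from the issue; the helper name build_encoder, the input width of 48 and the relu activation are illustrative assumptions.

from tensorflow.keras import backend as K
from tensorflow.keras.layers import Dense, Input, Lambda
from tensorflow.keras.models import Model

def sampling(args, latent_dim, epsilon_std=1.0):
    # Reparameterization trick: z = z_mean + exp(z_log_var / 2) * epsilon
    z_mean, z_log_var = args
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], latent_dim),
                              mean=0., stddev=epsilon_std)
    return z_mean + K.exp(z_log_var / 2) * epsilon

def build_encoder(latent_dim=24):
    inputs = Input(shape=(48,))
    h = Dense(48, activation='relu')(inputs)
    z_mean = Dense(latent_dim)(h)
    z_log_var = Dense(latent_dim)(h)
    # `arguments` passes latent_dim to sampling at call time, so it can stay
    # a local variable of the build function instead of a module-level global.
    z = Lambda(sampling, output_shape=(latent_dim,),
               arguments={'latent_dim': latent_dim})([z_mean, z_log_var])
    return Model(inputs, z)

build_encoder().summary()  # the sampling ops appear as a single Lambda layer

Because the backend ops run inside Lambda's call, the summary shows one lambda layer rather than a chain of tf_op_layer_* entries.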
Top GitHub Comments
I see. Yes, I do hope for an explanation from the developers. Thank you, TheRed86
I always had the same issue. I think the summary reports the best values found so far for the entire architecture. This means that if the search algorithm tried a network with more layers than the final optimal one, it keeps the best values of all those layers in memory. Therefore, you can ignore that output and take the number of neurons only for the layers of the architecture reported as optimal. If someone from the developers can confirm this behaviour, it would be much appreciated!
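
If the goal is just to read which values the tuner actually selected, the best trial's hyperparameters can be queried directly instead of relying on the printed search summary. A minimal sketch, assuming the standard Keras Tuner API and that tuner is the RandomSearch/MyTuner instance from the issue:

best_hp = tuner.get_best_hyperparameters(num_trials=1)[0]
print(best_hp.values)                        # hyperparameters of the best trial only
best_model = tuner.hypermodel.build(best_hp)  # rebuild the model with exactly those values
best_model.summary()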