First Batch Normalization layer in a Keras-generated model appears to be connected to everything.
- TensorBoard version: 1.10.0
- TensorFlow version: 1.10.0
- OS platform and version: Windows 10 64-bit and CentOS 6.9
- Python version: 2.7 and 3.6
- For browser-related issues: n/a
When using tensorflow.keras to create a model, the first BatchNormalization layer appears to be connected to every other batch normalization layer in the graph. I believe the graph is rendered incorrectly rather than built incorrectly, but I have not been able to prove that.
The code below builds the same model with pure TensorFlow and with tensorflow.keras, followed by the graph TensorBoard renders in each case.
This issue is probably related to this unanswered StackOverflow post: https://stackoverflow.com/questions/52586853/batchnormalization-nodes-wrongfully-linked-with-each-other and possibly to this TensorFlow issue: https://github.com/tensorflow/tensorflow/issues/17985
Graph produced by pure TensorFlow
Graph produced with the Keras model
TensorFlow code
import tensorflow as tf
import numpy as np
logdir="usingtf"
num_classes=10
x = tf.placeholder(tf.float32, shape=[None, 28,28,1], name="data_in")
y = tf.placeholder(tf.int32, shape=[None, num_classes], name="target_labels")
conv_1 = tf.layers.conv2d(inputs=x, filters=32, kernel_size=(3,3), name="Conv1")
bn_1 = tf.layers.batch_normalization(inputs=conv_1)
rl_1 = tf.nn.relu(bn_1)
conv_2 = tf.layers.conv2d(inputs=rl_1, filters=64, kernel_size=(3,3), name="Conv2")
bn_2 = tf.layers.batch_normalization(inputs=conv_2)
rl_2 = tf.nn.relu(bn_2)
maxpool_1 = tf.layers.max_pooling2d(inputs=rl_2, pool_size=2, strides=2, name="Pool1")
dropout_1 = tf.layers.dropout(inputs=maxpool_1, rate=0.25, name="Drop1")
flatten_1 = tf.layers.flatten(dropout_1)
dense_1 = tf.layers.dense(inputs=flatten_1, units=128, activation=tf.nn.relu, name="Dense1")
bn_3 = tf.layers.batch_normalization(inputs=dense_1)
rl_3 = tf.nn.relu(bn_3)
dropout_2 = tf.layers.dropout(rl_3, rate=0.5, name="Drop2")
dense_2 = tf.layers.dense(dropout_2, units=num_classes, name="Final")
with tf.Session() as sess:
    tbwriter = tf.summary.FileWriter(logdir)
    tbwriter.add_graph(sess.graph)
Keras equivalent
import tensorflow as tf
import tensorflow.keras as keras
import numpy as np
import tensorflow.keras.backend as K
sess=tf.Session()
K.set_session(sess)
logdir="usingkeras"
num_classes=10
model=keras.models.Sequential()
model.add(keras.layers.Conv2D(input_shape=(28,28,1),filters=32,kernel_size=(3,3),name="Conv1"))
#conv_1 =tf.layers.conv2d(inputs=x,filters=32,kernel_size=(3,3),name="Conv1")
model.add(keras.layers.BatchNormalization(name="FirstBatchnorm"))
#bn_1 =tf.layers.batch_normalization(inputs=conv_1)
model.add(keras.layers.Activation("relu"))
#rl_1 =tf.nn.relu(bn_1)
model.add(keras.layers.Conv2D(filters=64,kernel_size=(3,3),name="Conv2"))
#conv_2 =tf.layers.conv2d(inputs=rl_1,filters=64,kernel_size=(3,3),name="Conv2")
model.add(keras.layers.BatchNormalization())
#bn_2 =tf.layers.batch_normalization(inputs=conv_2)
model.add(keras.layers.Activation("relu"))
#rl_2 =tf.nn.relu(bn_2)
model.add(keras.layers.MaxPooling2D(pool_size=2,strides=2,name="Pool1"))
#maxpool_1 =tf.layers.max_pooling2d(inputs=rl_2,pool_size=2,strides=2,name="Pool1")
model.add(keras.layers.Dropout(0.25))
#dropout_1 =tf.layers.dropout(inputs=maxpool_1,rate=0.25,name="Drop1")
model.add(keras.layers.Flatten())
#flatten_1 =tf.layers.flatten(dropout_1)
model.add(keras.layers.Dense(units=128,activation="relu",name="Dense1"))
#dense_1 =tf.layers.dense(inputs=flatten_1,units=128,activation=tf.nn.relu,name="Dense1")
model.add(keras.layers.BatchNormalization())
#bn_3 =tf.layers.batch_normalization(inputs=dense_1)
model.add(keras.layers.Activation("relu"))
#rl_3 =tf.nn.relu(bn_3)
model.add(keras.layers.Dropout(0.5))
#dropout_2 =tf.layers.dropout(rl_3,rate=0.5,name="Drop2")
model.add(keras.layers.Dense(units=num_classes,name="Dense2"))
#dense_2 =tf.layers.dense(dropout_2,units=num_classes,name="Final")
model.compile("adam","categorical_crossentropy")
tbwriter=tf.summary.FileWriter(logdir)
tbwriter.add_graph(sess.graph)
model.summary()
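One way to test the "rendered incorrectly rather than built incorrectly" hypothesis from the report above is to inspect the GraphDef directly and check whether the batch-normalization nodes actually list tensors from the first layer among their inputs. The following is a diagnostic sketch, not part of the original report; the substrings it searches for (FirstBatchnorm and keras_learning_phase) are assumptions based on the layer names used in the code above.
# Diagnostic sketch (not in the original report): print batch-norm nodes
# whose inputs reference the first BatchNormalization layer or the
# keras_learning_phase placeholder; the name substrings are assumptions.
graph_def = sess.graph.as_graph_def()
for node in graph_def.node:
    if "batch_normalization" in node.name or "Batchnorm" in node.name:
        suspicious = [inp for inp in node.input
                      if "FirstBatchnorm" in inp or "keras_learning_phase" in inp]
        if suspicious:
            print(node.name, "<-", suspicious)
Whether such references appear, and which tensors they point at, would show whether the cross-links exist in the graph itself or only in how TensorBoard groups and draws the nodes.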
pinging @nuance-research
Top GitHub Comments
Just encountered the same problem today, and after some juggling I think I have found the true root of this behaviour.
As far as I understand it, the layers are not per se wrongfully linked; it looks like this is about intrinsic optimizations or something similar.
Keras uses BatchNormalization (and perhaps Dropout) layer implementations that depend on the learning_phase flag (a boolean value), because they should work differently while fitting and while evaluating. It looks like this flag is stored as an input in the first layer that uses it, and even if we set it to false manually (e.g., by calling keras.backend.set_learning_phase(0)), it is still used for some primary flag calculations that are propagated from there to all other layers with a similar internal structure. You can see that, I guess, just by observing the structure of a batch norm layer.
The real question is, how do we disable that flag propagation? Is it even possible to at least make it reducible by optimization (i.e., when learning_phase is manually set to zero, the input to the keras_learning_phase node is static, so the propagated flags could be static as well)? Or, to formulate the question so it is more related to this repo: is it possible for TensorBoard to automatically hide these Keras-specific autogenerated links that are unrelated to the actual logic of the graph?
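For reference, the manual override mentioned in the comment above would look roughly like this in TF 1.x tf.keras (a minimal sketch, assuming the same TF 1.10 setup as the report); per the comment, the keras_learning_phase placeholder still feeds the batch-norm layers even then, so this is at best a partial mitigation rather than a fix.
import tensorflow as tf
import tensorflow.keras.backend as K
# Sketch only: force the Keras learning phase to a constant (0 = inference)
# before any layers are built. Per the comment above, the flag is still
# propagated through the graph, so this does not remove the links.
K.set_learning_phase(0)
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Conv2D(filters=32, kernel_size=(3, 3), input_shape=(28, 28, 1)))
model.add(tf.keras.layers.BatchNormalization())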
I also have the problem that using Keras BatchNormalization produces four nodes in the Functions graph in TensorBoard for each BatchNormalization layer. A Keras model with just a few BatchNormalization layers makes TensorBoard's Graphs dashboard extremely slow and buggy; I suspect the rendering takes too long because of all the Functions nodes.
It would be nice if the Functions graph were hidden by default.
For example, this (first screenshot in the original comment) results in (second screenshot in the original comment).