
Batch norm update bug

See original GitHub issue

https://github.com/carpedm20/DCGAN-tensorflow/blob/60aa97b6db5d3cc1b62aec8948fd4c78af2059cd/model.py#L223

Here it looks like you intend to update only the discriminator, but you are updating the batch_norm parameters of the generator too.

The same holds here: https://github.com/carpedm20/DCGAN-tensorflow/blob/60aa97b6db5d3cc1b62aec8948fd4c78af2059cd/model.py#L232

You are minimizing self.g_loss, which internally runs through the discriminator weights instantiated with train=True, so the generator update step moves the discriminator's batch_norm statistics as well.
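For context, the training ops in question look roughly like this (paraphrased, not a verbatim quote of model.py; learning_rate and beta1 stand in for the repo's config values):

# var_list restricts which *trainable* variables each optimizer updates,
# but the batch-norm moving mean/variance are updated as a side effect of
# running the graph, not through the optimizer, so var_list does not
# restrict them.
d_optim = tf.train.AdamOptimizer(learning_rate, beta1=beta1) \
    .minimize(self.d_loss, var_list=self.d_vars)
g_optim = tf.train.AdamOptimizer(learning_rate, beta1=beta1) \
    .minimize(self.g_loss, var_list=self.g_vars)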

Issue Analytics

  • State: open
  • Created: 5 years ago
  • Comments: 6

Top GitHub Comments

2 reactions
youngleec commented, May 15, 2018

Oh I see. It may be a small bug, because updates_collections=None forces the updates of the moving mean and moving variance to happen in place. We may need to set updates_collections and add the dependency explicitly. Thanks a lot.
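A minimal sketch of that fix, assuming the tf.contrib.layers.batch_norm API this repo wraps (h stands for some layer output; the scope names, learning_rate, beta1, d_loss, and d_vars are illustrative):

import tensorflow as tf

# Register the moving mean/variance updates in a collection instead of
# forcing them in place with updates_collections=None.
h = tf.contrib.layers.batch_norm(h, is_training=True,
                                 updates_collections=tf.GraphKeys.UPDATE_OPS,
                                 scope='d_bn1')

# Make the discriminator train op depend only on the discriminator's own
# batch-norm update ops, so a D step never touches G's statistics.
d_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS, scope='discriminator')
with tf.control_dependencies(d_update_ops):
    d_optim = tf.train.AdamOptimizer(learning_rate, beta1=beta1) \
        .minimize(d_loss, var_list=d_vars)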

0 reactions
etienne-v commented, Oct 17, 2018

Hi @po0ya @youngleec. Do you have any solution or advice on the correct way to implement batch norm in this case? What I currently do is the following: I use placeholders for the training flags of the discriminator and the generator:

d_istraining_ph = tf.placeholder(tf.bool, name='d_istraining')  # tf.bool assumed for a training flag
g_istraining_ph = tf.placeholder(tf.bool, name='g_istraining')

When I build the generator and discriminator graph I pass the correct placeholders to d and g:

gz = generator(z, is_training=g_istraining_ph)
d_real = discriminator(x, is_training=d_istraining_ph)
d_fake = discriminator(gz, is_training=d_istraining_ph, reuse=True)

… where I’ve used tf.layers.batch_normalization(..., training=is_training, ...) inside the generator and discriminator models, and the reuse flag is set to True so the second call to the discriminator reuses the same variables.

When building the training operations I collect the associated trainable variables:

d_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='discriminator')
g_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='generator')

where I’ve used tf.variable_scope('discriminator', reuse=reuse) and tf.variable_scope('generator') to wrap the discriminator and generator in scopes when I defined them.
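For concreteness, a sketch of how the two model functions could be wrapped (the layer stack and sizes are illustrative, not the actual architecture):

def discriminator(x, is_training, reuse=False):
    with tf.variable_scope('discriminator', reuse=reuse):
        h = tf.layers.dense(x, 128)
        h = tf.layers.batch_normalization(h, training=is_training)
        h = tf.nn.leaky_relu(h)
        return tf.layers.dense(h, 1)

def generator(z, is_training):
    with tf.variable_scope('generator'):
        h = tf.layers.dense(z, 128)
        h = tf.layers.batch_normalization(h, training=is_training)
        h = tf.nn.relu(h)
        return tf.layers.dense(h, 784)  # output size is illustrative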

The optimizers, gradient calculations and update steps are as follows:

d_opt = tf.train.AdamOptimizer(...)
g_opt = tf.train.AdamOptimizer(...)

d_grads_vars = d_opt.compute_gradients(d_loss, d_vars)
g_grads_vars = g_opt.compute_gradients(g_loss, g_vars)

with tf.control_dependencies(tf.get_collection(tf.GraphKeys.UPDATE_OPS, scope='discriminator')):
    d_train = d_opt.apply_gradients(d_grads_vars)
with tf.control_dependencies(tf.get_collection(tf.GraphKeys.UPDATE_OPS, scope='generator')):
    g_train = g_opt.apply_gradients(g_grads_vars)

When calling d_train during training, I feed d_istraining_ph=True and g_istraining_ph=False, which forces the generator to use its running statistics while the discriminator keeps using batch statistics. In this case I suppose the discriminator’s batch norm statistics will be updated with both real data and fake data from the generator. When calling g_train during training I do the opposite and feed d_istraining_ph=False and g_istraining_ph=True.
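Concretely, the training loop would then feed the flags roughly like this (a sketch; sess, x_batch, and z_batch are assumed to exist):

# Discriminator step: D uses batch statistics and updates its moving
# averages; G runs in inference mode with its running statistics frozen.
sess.run(d_train, feed_dict={x: x_batch, z: z_batch,
                             d_istraining_ph: True, g_istraining_ph: False})

# Generator step: the opposite.
sess.run(g_train, feed_dict={z: z_batch,
                             d_istraining_ph: False, g_istraining_ph: True})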

Is my reasoning correct?

Read more comments on GitHub >

Top Results From Across the Web

  • BatchNorm problem with latest TCN · Issue #88 - GitHub
    Hi. After using the TCN code update 22 days back, all my code is not working. Please check the attached text files.
  • Curse of Batch Normalization - Towards Data Science
    During training, the output distribution of each intermediate activation layer shifts at each iteration as we update the previous weights. This ...
  • On The Perils of Batch Norm - Sorta Insightful
    This post is written for deep learning practitioners, and assumes you know what batch norm is and how it works.
  • Implementing Batchnorm in Pytorch. Problem with updating ...
    I'm trying to implement batch normalization in pytorch and apply it into VGG16 network. Here's my batchnorm below. class BatchNorm(nn. ...
  • Problem with updating running_mean and running_var in a ...
    The custom batchnorm works alright when using 1 GPU, but, when extended to 2 or more, the running mean and variance work in...
