Batch norm update bug
In here you're intending to update only the discriminator, but you are updating the batch_norm parameters of the generator too.

The same holds for here: https://github.com/carpedm20/DCGAN-tensorflow/blob/60aa97b6db5d3cc1b62aec8948fd4c78af2059cd/model.py#L232

You are updating `self.g_loss`, which internally uses the discriminator weights that were instantiated with `train=True`.
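To make the problem concrete, here is a stripped-down sketch of the pattern (not the repository's actual code; layer sizes and names are illustrative):

```python
import tensorflow as tf

z = tf.placeholder(tf.float32, [None, 100])

# batch_norm with updates_collections=None wires the moving mean/variance
# updates directly into the layer's output, so they run on every forward pass.
with tf.variable_scope('generator'):
    g_h = tf.contrib.layers.batch_norm(tf.layers.dense(z, 64),
                                       is_training=True,
                                       updates_collections=None)
    fake = tf.layers.dense(tf.nn.relu(g_h), 1)

with tf.variable_scope('discriminator'):
    d_logits = tf.layers.dense(fake, 1)

d_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
    logits=d_logits, labels=tf.zeros_like(d_logits)))

d_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='discriminator')

# var_list restricts the gradient step to the discriminator's weights, but the
# generator's batch-norm moving statistics are still updated as a side effect
# of the forward pass through `fake` whenever d_optim is run.
d_optim = tf.train.AdamOptimizer(2e-4).minimize(d_loss, var_list=d_vars)
```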
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Oh I see. It may be a small bug. Because `updates_collections=None` forces the updates for the moving mean and the moving variance to happen in place, we may need to set `updates_collections` and add the dependency explicitly. Thanks a lot.

Hi @po0ya @youngleec. Do you have any solution or advice on the correct way to implement batch norm in this case? What I currently do is the following. I use placeholders for the training flags for when d and g are training:
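Roughly like this (a minimal sketch):

```python
import tensorflow as tf

# Boolean placeholders so the batch-norm training behaviour of the
# discriminator and the generator can be toggled independently at run time.
d_istraining = tf.placeholder(tf.bool, name='d_istraining')
g_istraining = tf.placeholder(tf.bool, name='g_istraining')
```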
When I build the generator and discriminator graph I pass the correct placeholders to d and g:
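Something along these lines (a sketch with toy layer sizes; `z` and `real_images` are assumed inputs, and the placeholders come from the snippet above):

```python
def generator(z, is_training):
    """Toy generator; batch norm receives the generator's training flag."""
    with tf.variable_scope('generator'):
        h = tf.layers.dense(z, 64)
        h = tf.layers.batch_normalization(h, training=is_training)
        return tf.layers.dense(tf.nn.relu(h), 784)

def discriminator(x, is_training, reuse):
    """Toy discriminator; batch norm receives the discriminator's training flag."""
    with tf.variable_scope('discriminator', reuse=reuse):
        h = tf.layers.dense(x, 64)
        h = tf.layers.batch_normalization(h, training=is_training)
        return tf.layers.dense(tf.nn.relu(h), 1)

z = tf.placeholder(tf.float32, [None, 100])
real_images = tf.placeholder(tf.float32, [None, 784])

# The generator gets its own training flag; the discriminator is built once on
# real data and reused on fake data, both calls with the discriminator's flag.
G = generator(z, is_training=g_istraining)
D_real_logits = discriminator(real_images, is_training=d_istraining, reuse=False)
D_fake_logits = discriminator(G, is_training=d_istraining, reuse=True)
```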
… where I’ve implemented the
tf.layers.batch_normalization(...., training=is_training, ...)
function inside the generator and discriminator models and thereuse
flag is set toTrue
to reuse the same variables for the second call todiscriminator
.When building the training operations I collect the associated trainable variables:
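Roughly (a sketch, assuming the scope names used above):

```python
# Collect the trainable variables of each sub-network by variable scope.
d_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='discriminator')
g_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='generator')
```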
where I’ve used
tf.variable_scope('discriminator', reuse=reuse)
andtf.variable_scope('generator')
to wrap the discriminator and generator in scopes when I defined them.The optimizers, gradient calculations and update steps are as follow:
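Something like this (a sketch; `d_loss` and `g_loss` are the usual GAN losses, assumed to be defined already, and the optimizer settings are placeholders):

```python
# The moving-average updates created by tf.layers.batch_normalization live in
# the UPDATE_OPS collection; run only the ones belonging to the sub-network
# that is actually being trained.
d_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS, scope='discriminator')
g_update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS, scope='generator')

d_optimizer = tf.train.AdamOptimizer(learning_rate=2e-4, beta1=0.5)
g_optimizer = tf.train.AdamOptimizer(learning_rate=2e-4, beta1=0.5)

with tf.control_dependencies(d_update_ops):
    d_train = d_optimizer.minimize(d_loss, var_list=d_vars)

with tf.control_dependencies(g_update_ops):
    g_train = g_optimizer.minimize(g_loss, var_list=g_vars)
```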
When calling `d_train` during training, I set `d_istraining=True` and `g_istraining=False`, which forces the generator to use its running statistics but not so for the discriminator. In this case I suppose the discriminator's batch norm statistics will be updated with both real data and fake data from the generator. When calling `g_train` during training I do the opposite, setting `d_istraining=False` and `g_istraining=True` (feed pattern sketched below).

Is my reasoning correct?
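For concreteness, the feed pattern I have in mind is roughly this (a sketch; `sess`, `batch_z` and `batch_images` are assumed):

```python
# Discriminator step: update D's weights and batch-norm statistics,
# keep G's batch norm in inference mode.
sess.run(d_train, feed_dict={z: batch_z, real_images: batch_images,
                             d_istraining: True, g_istraining: False})

# Generator step: update G's weights and batch-norm statistics,
# keep D's batch norm in inference mode.
sess.run(g_train, feed_dict={z: batch_z, real_images: batch_images,
                             d_istraining: False, g_istraining: True})
```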