EMA Update of bn buffer
The following function applies a moving average to the EMA model. But it doesn’t update the BatchNorm statistics (running_mean and running_var), since these two are buffers, not parameters.
def update_moving_average(ema_updater, ma_model, current_model):
    for current_params, ma_params in zip(current_model.parameters(), ma_model.parameters()):
        old_weight, up_weight = ma_params.data, current_params.data
        ma_params.data = ema_updater.update_average(old_weight, up_weight)
Should I use this function instead?
def update_moving_average(ema_updater, ma_model, current_model):
    # Parameters (weights, biases)
    for current_params, ma_params in zip(current_model.parameters(), ma_model.parameters()):
        old_weight, up_weight = ma_params.data, current_params.data
        ma_params.data = ema_updater.update_average(old_weight, up_weight)
    # Buffers (e.g. BatchNorm running_mean and running_var)
    for current_buffers, ma_buffers in zip(current_model.buffers(), ma_model.buffers()):
        old_weight, up_weight = ma_buffers.data, current_buffers.data
        # Assign to ma_buffers here, not ma_params
        ma_buffers.data = ema_updater.update_average(old_weight, up_weight)
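A minimal, self-contained sketch of the buffer-aware variant follows. The `EMA` class is a hypothetical stand-in for the repository's `ema_updater`, assuming `update_average(old, new)` returns `old * beta + (1 - beta) * new`. Copying integer buffers such as BatchNorm's `num_batches_tracked` instead of averaging them is a choice made for this sketch, not something stated in the issue.

```python
import copy

import torch
from torch import nn


class EMA:
    """Hypothetical stand-in for the repository's ema_updater (assumed formula)."""

    def __init__(self, beta):
        self.beta = beta

    def update_average(self, old, new):
        return old * self.beta + (1 - self.beta) * new


@torch.no_grad()
def update_moving_average(ema_updater, ma_model, current_model):
    # Parameters: weights and biases.
    for current_params, ma_params in zip(current_model.parameters(), ma_model.parameters()):
        ma_params.copy_(ema_updater.update_average(ma_params, current_params))
    # Buffers: e.g. BatchNorm running_mean and running_var.
    for current_buffer, ma_buffer in zip(current_model.buffers(), ma_model.buffers()):
        if ma_buffer.is_floating_point():
            ma_buffer.copy_(ema_updater.update_average(ma_buffer, current_buffer))
        else:
            # Integer counters such as num_batches_tracked are copied, not averaged.
            ma_buffer.copy_(current_buffer)


online = nn.Sequential(nn.Linear(4, 8), nn.BatchNorm1d(8))
target = copy.deepcopy(online)

online.train()
online(torch.randn(16, 4))  # one forward pass updates the online BN buffers

before = target[1].running_mean.clone()
update_moving_average(EMA(0.99), target, online)
```

After the call, `target[1].running_mean` has moved 1% of the way from its old value toward the online model's running mean, and the buffer dtypes are unchanged.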
Issue Analytics
- State:
- Created 2 years ago
- Comments:5 (2 by maintainers)
Top GitHub Comments
@luyvlei so i think the issue is because the batchnorm statistics are already a moving average - i’ll have to read the momentum squared paper above in detail and see if the conclusions are sound
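To make the "already a moving average" point concrete, here is a small pure-Python sketch of the update rule PyTorch documents for BatchNorm running statistics, `new = (1 - momentum) * running + momentum * batch_stat`, with the default `momentum=0.1`. The helper name `bn_running_update` and the second EMA with `beta = 0.99` stacked on top are illustrative assumptions, not code from the repository.

```python
def bn_running_update(running, batch_stat, momentum=0.1):
    # Update rule documented for torch.nn.BatchNorm running statistics.
    return (1 - momentum) * running + momentum * batch_stat


running_mean = 0.0   # BatchNorm initializes running_mean to zero
target_mean = 0.0    # hypothetical EMA copy kept by the target network
beta = 0.99          # assumed EMA decay for the target network

for batch_mean in [1.0, 1.0, 1.0]:
    # First moving average: done by BatchNorm itself during training.
    running_mean = bn_running_update(running_mean, batch_mean)
    # Second moving average: what an EMA over buffers would add on top.
    target_mean = beta * target_mean + (1 - beta) * running_mean
```

The target's copy therefore lags the batch statistics twice, once through BatchNorm's own momentum and once through the EMA decay.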
as an aside, there are papers that are starting to use SimSiam (Kaiming’s work, where the teacher is the same as the student but with a stop gradient) successfully, which does not require exponential moving averages as BYOL does. so i’m wondering how important these little details are, and whether it is worth the time to even debug
https://arxiv.org/abs/2111.00210 https://arxiv.org/abs/2110.05208
Yep, if the target model uses train mode, the BN statistics don’t matter since they will never be used. And this implementation also uses train mode. But it’s not clear that eval mode would have yielded any better results @iseessel
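A quick way to check the train-mode claim is the sketch below, which uses a standalone `nn.BatchNorm1d` layer rather than the repository's model. In train mode the output is normalized with the current batch statistics, so overwriting the running buffers does not change it; in eval mode the buffers are used and the output changes.

```python
import torch
from torch import nn

bn = nn.BatchNorm1d(3)
x = torch.randn(8, 3)

# Train mode: normalization uses the statistics of the current batch.
bn.train()
out_a = bn(x)
with torch.no_grad():
    # Overwrite the running statistics with arbitrary values.
    bn.running_mean.fill_(100.0)
    bn.running_var.fill_(5.0)
out_b = bn(x)
train_same = torch.allclose(out_a, out_b)

# Eval mode: normalization uses the (now very different) running statistics.
bn.eval()
out_c = bn(x)
eval_differs = not torch.allclose(out_a, out_c)
```

Here `train_same` and `eval_differs` both come out true, so the buffer EMA only affects the target network if it is run in eval mode.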