EMA Update of bn buffer

The following function applies a moving average to the EMA model. But it doesn’t update the BatchNorm statistics (running_mean and running_var), since these two are buffers rather than parameters.

def update_moving_average(ema_updater, ma_model, current_model):
    for current_params, ma_params in zip(current_model.parameters(), ma_model.parameters()):
        old_weight, up_weight = ma_params.data, current_params.data
        ma_params.data = ema_updater.update_average(old_weight, up_weight)

Should I use this function instead?

def update_moving_average(ema_updater, ma_model, current_model):
    for current_params, ma_params in zip(current_model.parameters(), ma_model.parameters()):
        old_weight, up_weight = ma_params.data, current_params.data
        ma_params.data = ema_updater.update_average(old_weight, up_weight)
    for current_buffers, ma_buffers in zip(current_model.buffers(), ma_model.buffers()):
        old_weight, up_weight = ma_buffers.data, current_buffers.data
        ma_buffers.data = ema_updater.update_average(old_weight, up_weight)
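
One possible way to write the buffer-aware version is sketched below. It is an illustration, not the repository's actual code: the `EMA` class is a hypothetical stand-in for `ema_updater`, and copying integer buffers such as `num_batches_tracked` instead of averaging them is an assumption made here, since a float moving average does not fit an integer counter.

```python
import torch
from torch import nn


class EMA:
    """Hypothetical stand-in for the `ema_updater` object in the question."""

    def __init__(self, beta):
        self.beta = beta

    def update_average(self, old, new):
        # Exponential moving average: beta * old + (1 - beta) * new.
        return old * self.beta + (1 - self.beta) * new


@torch.no_grad()
def update_moving_average(ema_updater, ma_model, current_model):
    # Parameters: same update as in the original function.
    for current_params, ma_params in zip(current_model.parameters(), ma_model.parameters()):
        ma_params.data.copy_(ema_updater.update_average(ma_params.data, current_params.data))
    # Buffers: running_mean and running_var are float tensors and can be averaged.
    # Integer buffers (e.g. num_batches_tracked) are copied as-is (assumption).
    for current_buffers, ma_buffers in zip(current_model.buffers(), ma_model.buffers()):
        if ma_buffers.is_floating_point():
            ma_buffers.data.copy_(ema_updater.update_average(ma_buffers.data, current_buffers.data))
        else:
            ma_buffers.data.copy_(current_buffers.data)
```

Both models are assumed to share the same architecture, so `parameters()` and `buffers()` yield tensors in the same order.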

Issue Analytics

  • State: open
  • Created 2 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

2 reactions
lucidrains commented, Dec 18, 2021

@luyvlei so i think the issue is because the batchnorm statistics are already a moving average - i’ll have to read the momentum squared paper above in detail and see if the conclusions are sound

as an aside, there are papers that are starting to use SimSiam (Kaiming’s work where the teacher is the same as the student, but with a stop gradient) successfully, and which does not require exponential moving averages as BYOL does. so i’m wondering how important these little details are, and whether it is worth the time to even debug

https://arxiv.org/abs/2111.00210 https://arxiv.org/abs/2110.05208

1 reaction
luyvlei commented, Jan 6, 2022

Yep, if the target model uses train mode, the statistics of BN don’t matter since they will never be used. And this implementation also uses train mode. But it’s not clear that eval mode would have yielded any better results @iseessel
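
The train-mode point can be checked with a small experiment. This is a minimal sketch on a standalone `nn.BatchNorm1d` layer, not on the model from this repository, and the fill values are arbitrary. In train mode the forward pass normalizes with the current batch statistics, so overwriting the running statistics should leave the output unchanged. In eval mode the running statistics are what gets used.

```python
import torch
from torch import nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(3)
x = torch.randn(16, 3)

# Train mode: output depends only on the statistics of the batch x.
bn.train()
out_before = bn(x)

# Overwrite the running statistics with arbitrary values.
with torch.no_grad():
    bn.running_mean.fill_(100.0)
    bn.running_var.fill_(50.0)

out_after = bn(x)
train_outputs_match = torch.allclose(out_before, out_after)

# Eval mode: the (overwritten) running statistics are now used.
bn.eval()
out_eval = bn(x)
eval_differs = not torch.allclose(out_before, out_eval)

print(train_outputs_match, eval_differs)
```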
