
EMA Update of bn buffer


The following function applies a moving average to the EMA model. But it doesn’t update the BatchNorm statistics (running_mean and running_var), since these two are not parameters but buffers.

def update_moving_average(ema_updater, ma_model, current_model):
    for current_params, ma_params in zip(current_model.parameters(), ma_model.parameters()):
        old_weight, up_weight = ma_params.data, current_params.data
        ma_params.data = ema_updater.update_average(old_weight, up_weight)

Should I use this function instead?

def update_moving_average(ema_updater, ma_model, current_model):
    for current_params, ma_params in zip(current_model.parameters(), ma_model.parameters()):
        old_weight, up_weight = ma_params.data, current_params.data
        ma_params.data = ema_updater.update_average(old_weight, up_weight)
    for current_buffers, ma_buffers in zip(current_model.buffers(), ma_model.buffers()):
        old_weight, up_weight = ma_buffers.data, current_buffers.data
        ma_buffers.data = ema_updater.update_average(old_weight, up_weight)
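
For reference, here is a minimal sketch of the buffer-aware variant being asked about. It is not the repository's code: the name `update_moving_average_with_buffers` and the plain `beta` argument (standing in for `ema_updater`) are assumptions made for illustration. One detail worth noting is that a BatchNorm layer also registers an integer `num_batches_tracked` buffer, so the sketch averages only floating-point buffers and copies the rest.

```python
import copy

import torch
from torch import nn


@torch.no_grad()
def update_moving_average_with_buffers(beta, ma_model, current_model):
    """Hypothetical helper: EMA over parameters and floating-point buffers."""
    # Parameters: ma = beta * ma + (1 - beta) * current, done in place.
    for cur, ma in zip(current_model.parameters(), ma_model.parameters()):
        ma.mul_(beta).add_(cur, alpha=1 - beta)

    # Buffers: running_mean / running_var are float tensors and can be averaged.
    # num_batches_tracked is an integer counter, so it is copied instead.
    for cur, ma in zip(current_model.buffers(), ma_model.buffers()):
        if ma.is_floating_point():
            ma.mul_(beta).add_(cur, alpha=1 - beta)
        else:
            ma.copy_(cur)


# Usage: the target starts as a copy of the online model.
online = nn.Sequential(nn.Linear(4, 3), nn.BatchNorm1d(3))
target = copy.deepcopy(online)
online(torch.randn(8, 4))  # train-mode forward updates the online BN buffers
update_moving_average_with_buffers(0.99, target, online)
```

Whether averaging the buffers actually helps is a separate question; the sketch only shows how it could be done.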

Issue Analytics

Created 2 years ago
Comments: 5 (2 by maintainers)

Top GitHub Comments


lucidrains commented, Dec 18, 2021

@luyvlei so i think the issue is because the batchnorm statistics are already a moving average - i’ll have to read the momentum squared paper above in detail and see if the conclusions are sound

as an aside, there are papers that are starting to use SimSiam (Kaiming’s work, where the teacher is the same as the student but with a stop gradient) successfully, and SimSiam does not require exponential moving averages as BYOL does. so i’m wondering how important these little details are, and whether it is worth the time to even debug

https://arxiv.org/abs/2111.00210 https://arxiv.org/abs/2110.05208
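
To make the "already a moving average" point concrete, here is a small pure-Python sketch. The running-statistic update follows the convention PyTorch's BatchNorm uses, `(1 - momentum) * running + momentum * batch_stat`. The form of the target EMA, `beta * old + (1 - beta) * new`, and all function names are assumptions for illustration, since the issue does not show what `update_average` does internally.

```python
def bn_running_update(running, batch_stat, momentum=0.1):
    """BatchNorm-style running statistic: itself an exponential moving average."""
    return (1.0 - momentum) * running + momentum * batch_stat


def ema_update(old, new, beta=0.99):
    """Assumed form of the target-network EMA."""
    return beta * old + (1.0 - beta) * new


def simulate(batch_means, momentum=0.1, beta=0.99):
    """Track the online running mean and an EMA of it in the target network."""
    online_running = 0.0
    target_running = 0.0
    for m in batch_means:
        online_running = bn_running_update(online_running, m, momentum)
        # Averaging the buffer again stacks a second EMA on top of the first.
        target_running = ema_update(target_running, online_running, beta)
    return online_running, target_running


online, target = simulate([1.0] * 50)
print(online, target)  # the target lags well behind the online running mean
```

With a constant batch mean, the doubly averaged value converges much more slowly than the online running mean, which is one way to see why averaging the buffers is not obviously the right thing to do.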


luyvlei commented, Jan 6, 2022

Yep, if the target model uses train mode, the statistics of BN don’t matter since they will never be used. And this implementation also uses train mode. But it’s not clear that eval mode would have yielded any better results @iseessel
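
The train-mode claim can be checked with a short sketch. The helper name `output_changes_when_running_stats_change` is hypothetical; it overwrites the running statistics of a standalone `nn.BatchNorm1d` layer and reports whether the output for the same batch changes.

```python
import torch
from torch import nn


def output_changes_when_running_stats_change(train_mode: bool) -> bool:
    """Hypothetical check: does BN output depend on running_mean / running_var?"""
    torch.manual_seed(0)
    bn = nn.BatchNorm1d(3)
    bn.train(train_mode)  # train(False) is equivalent to eval()
    x = torch.randn(8, 3)
    with torch.no_grad():
        before = bn(x)
        # Overwrite the buffers with arbitrary values.
        bn.running_mean.fill_(100.0)
        bn.running_var.fill_(4.0)
        after = bn(x)
    return not torch.allclose(before, after)


print(output_changes_when_running_stats_change(train_mode=True))   # False
print(output_changes_when_running_stats_change(train_mode=False))  # True
```

In train mode the layer normalizes with the statistics of the current batch, so the buffers only matter once the target network is switched to eval mode.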
