
Is it possible to manually reset momentum in SGD?

See original GitHub issue

Context: I am running the same optimizer multiple times for cross-validation and trying to completely reset it at each run, so that I can avoid recompiling exactly the same model every time. My data is relatively small and my GPU is quite fast, so training converges very quickly and the compilation time becomes comparable to the training time.

What I have tried: to reinitialize things I stumbled upon this issue and simply saved and reloaded my initial parameters at each fold. This appeared to work until I decided to include momentum. My hyper-parameter search converged to a configuration with very high momentum and gave absurdly good results, and then I realized my mistake: I had forgotten to reinitialize the momentum shared variable, so information was leaking from fold to fold.

I gave the code in optimizers.py a quick read, and as far as I understand, the shared variables m that hold the momentum values are kept internal, so I have no access to them.
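
To make it concrete what that internal state is, here is a minimal NumPy sketch of SGD with momentum (an illustration only, not Keras's actual optimizers.py code): the optimizer keeps one velocity buffer per parameter, and those buffers are exactly what leaks across folds if they are never zeroed.

    import numpy as np

    class SGDMomentum:
        """Toy SGD with momentum: one velocity buffer ("m") per parameter."""

        def __init__(self, params, lr=0.01, momentum=0.9):
            self.lr = lr
            self.momentum = momentum
            # Persistent per-parameter state, analogous to the shared variables m.
            self.velocity = [np.zeros_like(p) for p in params]

        def step(self, params, grads):
            for p, v, g in zip(params, self.velocity, grads):
                v *= self.momentum      # decay the accumulated velocity
                v -= self.lr * g        # add the new gradient contribution
                p += v                  # update the parameter in place

        def reset(self):
            # Forgetting to call this between folds leaks optimizer state.
            for v in self.velocity:
                v.fill(0.0)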

TLDR:

  • How can I get access to the shared variables holding the momentum values?
  • Is there any built-in solution in Keras for cross-validation that does not involve recompiling everything again? (A sketch of one possible workaround follows this list.)
  • Are there any other shared variables that could be leaking information from one fold to another?
  • Are there plans to add a proper reinitialization feature to Keras?
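
On the second question: Keras has no built-in cross-validation loop. A common workaround, sketched below against the tf.keras API (an assumption; the original issue predates it), is to snapshot the freshly initialized weights once and restore them at the start of every fold, compiling with a brand-new optimizer so no momentum survives from the previous fold. The build_model helper, the toy data, and the use of scikit-learn's KFold are placeholders.

    import numpy as np
    import tensorflow as tf
    from sklearn.model_selection import KFold

    def build_model():
        # Placeholder architecture; substitute your own.
        return tf.keras.Sequential([
            tf.keras.layers.Dense(32, activation="relu", input_shape=(10,)),
            tf.keras.layers.Dense(1),
        ])

    X = np.random.rand(200, 10).astype("float32")
    y = np.random.rand(200, 1).astype("float32")

    model = build_model()
    initial_weights = model.get_weights()   # snapshot of the untrained parameters

    for train_idx, val_idx in KFold(n_splits=5).split(X):
        model.set_weights(initial_weights)  # back to the untrained state
        # A fresh optimizer per fold: its momentum slots start from zero.
        model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
                      loss="mse")
        model.fit(X[train_idx], y[train_idx], epochs=5, verbose=0)
        print(model.evaluate(X[val_idx], y[val_idx], verbose=0))

With a TensorFlow backend, re-running compile per fold is cheap compared to the Theano-era function compilation that motivated the original question, so the recompilation cost largely disappears.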

Issue Analytics

  • State: closed
  • Created 8 years ago
  • Comments: 9 (3 by maintainers)

Top GitHub Comments

7 reactions
mrgloom commented, Jun 28, 2019

It seems get_state and set_state were removed; what is the current method to reset the optimizer state?
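
The answer has changed between releases, so the following is only a sketch for TF 2.x-era tf.keras optimizers, which exposed get_weights()/set_weights(); later optimizer rewrites dropped those methods again, so check what your installed version actually provides.

    import numpy as np
    import tensorflow as tf

    # Tiny model, trained for one epoch so the optimizer actually creates its slots.
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
    optimizer = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)
    model.compile(optimizer=optimizer, loss="mse")
    model.fit(np.random.rand(8, 4), np.random.rand(8, 1), epochs=1, verbose=0)

    # Overwrite every piece of optimizer state (iteration counter, momentum
    # slots, ...) with zeros, without touching the model weights.
    optimizer.set_weights([np.zeros_like(w) for w in optimizer.get_weights()])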

5 reactions
fchollet commented, Jul 28, 2015

Successive calls to fit do not reset any of the parameters of the model, including the state of the optimizers. Successive calls to fit with nb_epoch = 1 are effectively the same as a single call to fit.
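
That behaviour is easy to reproduce today as well. A small sketch, written against the tf.keras API (an assumption; the original comment uses the 2015-era nb_epoch argument): the second fit call below simply continues from the weights and momentum left behind by the first, so the pair behaves like one two-epoch run.

    import numpy as np
    import tensorflow as tf

    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.1, momentum=0.9),
                  loss="mse")
    X = np.random.rand(32, 4).astype("float32")
    y = np.random.rand(32, 1).astype("float32")

    model.fit(X, y, epochs=1, verbose=0)  # first epoch
    model.fit(X, y, epochs=1, verbose=0)  # continues training; nothing is reset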

Read more comments on GitHub >

Top Results From Across the Web

Scheduled Restart Momentum for Accelerated ... - OpenReview
In this paper, we propose scheduled restart SGD (SRSGD), a new NAG-style scheme for training DNNs. SRSGD replaces the constant momentum in SGD...
Read more >
[2002.10583] Scheduled Restart Momentum for Accelerated ...
In this paper, we propose Scheduled Restart SGD (SRSGD), a new NAG-style scheme for training DNNs. SRSGD replaces the constant momentum in SGD...
Read more >
How does Stochastic Gradient Descent with momentum ...
I read that "SGD momentum goes past the minima (due to its velocity build up) and then correct themselves and then comes back...
Read more >
Stochastic Gradient Descent with momentum | by Vitaly Bushaev
Momentum [1], or SGD with momentum, is a method which helps accelerate gradient vectors in the right directions, thus leading to faster convergence.
Read more >
How to implement momentum in mini-batch gradient descent ...
Now the regression line is calculated correctly (maybe). With SGD the final error is 59706304 and with momentum the final error is 56729062,...
Read more >
