
Is it possible to manually reset momentum in SGD?

See original GitHub issue

Context: I am running the same optimizer multiple times for cross-validation and trying to completely reset it at each run, so that I can avoid recompiling exactly the same model every time. My data is relatively small and my GPU is quite fast, so training converges very quickly and the compilation time becomes comparable to the training time.

What I have tried: to reinitialize things I stumbled upon this issue and simply saved and reloaded my initial parameters at each fold. This appeared to work until I decided to include momentum. My hyper-parameter search converged to a configuration with very high momentum and gave absurdly good results, and then I realized my mistake: I had forgotten to reinitialize the momentum shared variable, so information was leaking from fold to fold.

I gave the code in optimizers.py a quick read, and as far as I understand, the shared variables m that hold the momentum values are kept internal, so I have no access to them.
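
To make it concrete what that internal state is, here is a minimal NumPy sketch of SGD with momentum (an illustration only, not Keras's actual optimizers.py code): the optimizer keeps one velocity buffer per parameter, and those buffers are exactly what leaks across folds if they are never zeroed.

    import numpy as np

    class SGDMomentum:
        """Toy SGD with momentum: one velocity buffer ("m") per parameter."""

        def __init__(self, params, lr=0.01, momentum=0.9):
            self.lr = lr
            self.momentum = momentum
            # Persistent per-parameter state, analogous to the shared variables m.
            self.velocity = [np.zeros_like(p) for p in params]

        def step(self, params, grads):
            for p, v, g in zip(params, self.velocity, grads):
                v *= self.momentum      # decay the accumulated velocity
                v -= self.lr * g        # add the new gradient contribution
                p += v                  # update the parameter in place

        def reset(self):
            # Forgetting to call this between folds leaks optimizer state.
            for v in self.velocity:
                v.fill(0.0)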

TLDR:

  • How can I get access to the shared variables holding the momentum values?
  • Is there any built-in solution in Keras for cross-validation that does not involve recompiling everything again? (A sketch of one possible workaround follows this list.)
  • Are there any other shared variables that could be leaking information from one fold to another?
  • Are there plans to add a proper reinitialization feature to Keras?
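
On the second question: Keras has no built-in cross-validation loop. A common workaround, sketched below against the tf.keras API (an assumption; the original issue predates it), is to snapshot the freshly initialized weights once and restore them at the start of every fold, compiling with a brand-new optimizer so no momentum survives from the previous fold. The build_model helper, the toy data, and the use of scikit-learn's KFold are placeholders.

    import numpy as np
    import tensorflow as tf
    from sklearn.model_selection import KFold

    def build_model():
        # Placeholder architecture; substitute your own.
        return tf.keras.Sequential([
            tf.keras.layers.Dense(32, activation="relu", input_shape=(10,)),
            tf.keras.layers.Dense(1),
        ])

    X = np.random.rand(200, 10).astype("float32")
    y = np.random.rand(200, 1).astype("float32")

    model = build_model()
    initial_weights = model.get_weights()   # snapshot of the untrained parameters

    for train_idx, val_idx in KFold(n_splits=5).split(X):
        model.set_weights(initial_weights)  # back to the untrained state
        # A fresh optimizer per fold: its momentum slots start from zero.
        model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
                      loss="mse")
        model.fit(X[train_idx], y[train_idx], epochs=5, verbose=0)
        print(model.evaluate(X[val_idx], y[val_idx], verbose=0))

With a TensorFlow backend, re-running compile per fold is cheap compared to the Theano-era function compilation that motivated the original question, so the recompilation cost largely disappears.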

Issue Analytics

  • State: closed
  • Created 8 years ago
  • Comments: 9 (3 by maintainers)

Top GitHub Comments

7 reactions
mrgloom commented, Jun 28, 2019

It seems get_state and set_state were removed; what is the current method to reset the optimizer state?
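
The answer has changed between releases, so the following is only a sketch for TF 2.x-era tf.keras optimizers, which exposed get_weights()/set_weights(); later optimizer rewrites dropped those methods again, so check what your installed version actually provides.

    import numpy as np
    import tensorflow as tf

    # Tiny model, trained for one epoch so the optimizer actually creates its slots.
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
    optimizer = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)
    model.compile(optimizer=optimizer, loss="mse")
    model.fit(np.random.rand(8, 4), np.random.rand(8, 1), epochs=1, verbose=0)

    # Overwrite every piece of optimizer state (iteration counter, momentum
    # slots, ...) with zeros, without touching the model weights.
    optimizer.set_weights([np.zeros_like(w) for w in optimizer.get_weights()])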

5 reactions
fchollet commented, Jul 28, 2015

Successive calls to fit do not reset any of the parameters of the model, including the state of the optimizers. Successive calls to fit with nb_epoch = 1 are effectively the same as a single call to fit.
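
That behaviour is easy to reproduce today as well. A small sketch, written against the tf.keras API (an assumption; the original comment uses the 2015-era nb_epoch argument): the second fit call below simply continues from the weights and momentum left behind by the first, so the pair behaves like one two-epoch run.

    import numpy as np
    import tensorflow as tf

    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.1, momentum=0.9),
                  loss="mse")
    X = np.random.rand(32, 4).astype("float32")
    y = np.random.rand(32, 1).astype("float32")

    model.fit(X, y, epochs=1, verbose=0)  # first epoch
    model.fit(X, y, epochs=1, verbose=0)  # continues training; nothing is reset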

Read more comments on GitHub >

Top Results From Across the Web

Scheduled Restart Momentum for Accelerated ... - OpenReview
In this paper, we propose scheduled restart SGD (SRSGD), a new NAG-style scheme for training DNNs. SRSGD replaces the constant momentum in SGD...
Read more >
[2002.10583] Scheduled Restart Momentum for Accelerated ...
In this paper, we propose Scheduled Restart SGD (SRSGD), a new NAG-style scheme for training DNNs. SRSGD replaces the constant momentum in SGD...
Read more >
How does Stochastic Gradient Descent with momentum ...
I read that "SGD momentum goes past the minima (due to its velocity build up) and then correct themselves and then comes back...
Read more >
Stochastic Gradient Descent with momentum | by Vitaly Bushaev
Momentum [1], or SGD with momentum, is a method which helps accelerate gradient vectors in the right directions, thus leading to faster convergence.
Read more >
How to implement momentum in mini-batch gradient descent ...
Now the regression line is calculated correctly (maybe). With SGD the final error is 59706304 and with momentum the final error is 56729062,...
Read more >
