YOGI Initialization

exp_avg_sq Initialization

“Thus, for YOGI, we propose to initialize the vt based on gradient square evaluated at the initial point averaged over a (reasonably large) mini-batch.”

So exp_avg_sq should be initialized to the squared gradient evaluated at the initial point.
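A minimal sketch of that initialization in plain Python, not the torch_optimizer code. The least-squares loss, the helper names `minibatch_gradient` and `init_exp_avg_sq`, and the toy batch are all assumptions made for illustration. It also assumes that "gradient square averaged over a mini-batch" means the element-wise square of the mini-batch-averaged gradient; the quoted sentence could also be read as the average of per-sample squared gradients.

```python
def minibatch_gradient(w, batch):
    """Gradient of the loss 0.5 * (w.x - y)^2, averaged over the mini-batch."""
    grad = [0.0] * len(w)
    for x, y in batch:
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        for i, xi in enumerate(x):
            grad[i] += residual * xi / len(batch)
    return grad


def init_exp_avg_sq(w0, batch):
    """Hypothetical helper: v0 = element-wise square of the initial gradient."""
    g0 = minibatch_gradient(w0, batch)
    return [g * g for g in g0]


# Toy data (assumed): two samples of the form (features, target).
w0 = [0.0, 0.0]
batch = [([1.0, 2.0], 1.0), ([3.0, 4.0], 2.0)]

exp_avg_sq = init_exp_avg_sq(w0, batch)
print(exp_avg_sq)
```

With this choice the second-moment estimate starts at the scale of the actual gradients instead of at a fixed constant.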

exp_avg Initialization

image

Based on m0 above, the YOGI optimizer's exp_avg should be initialized to zero rather than to initial_accumulator.
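To see why the starting value of exp_avg matters, here is a hedged scalar sketch of one Yogi step, written from the update rule in the paper and not taken from any library. The function name `yogi_step`, the hyperparameter values, and the omission of bias correction are assumptions for illustration; 1e-6 stands in for a nonzero initial_accumulator.

```python
import math


def yogi_step(x, g, m, v, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-3):
    """One scalar Yogi update (no bias correction, illustrative defaults)."""
    # First moment: exponential moving average of the gradient.
    m = beta1 * m + (1 - beta1) * g
    # Second moment: additive update controlled by sign(v - g^2).
    g2 = g * g
    sign = (v > g2) - (v < g2)
    v = v - (1 - beta2) * sign * g2
    # Parameter update.
    x = x - lr * m / (math.sqrt(v) + eps)
    return x, m, v


# With a zero gradient, any nonzero m0 still leaks into the first moment.
x_zero, m_zero, _ = yogi_step(x=1.0, g=0.0, m=0.0, v=1.0)
x_const, m_const, _ = yogi_step(x=1.0, g=0.0, m=1e-6, v=1.0)

print(m_zero, m_const)
```

Starting from m0 = 0 leaves the parameter untouched when the gradient is zero, while a nonzero m0 contributes a term beta1 * m0 that does not come from any gradient.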

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

avinashsai commented, Mar 7, 2020

sure, will submit PR with the change

avinashsai commented, Mar 7, 2020

@PetrochukM I had doubts about this as well, so I referred to the author’s official implementation in TensorFlow (https://github.com/tensorflow/addons/blob/master/tensorflow_addons/optimizers/yogi.py).

At line 119, they initialize the first and second moments with a constant value.
