How does Adagrad work in Keras?
Hi all,
I posted a question about Keras' optimizer on Stack Overflow. Can anybody help me answer it?
The function get_updates()
in Adagrad seems to perform only a single update step. But shouldn't the accumulators store the history information? Why are they initialized to zeros at each step? How can they act as accumulators across the whole training process?
What does this line do?
self.weights = accumulators
It seems self.weights is never used again afterwards.
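To make the question concrete, my mental model of Adagrad is the eager NumPy toy below (not the Keras code; the learning rate, epsilon, and gradient are made-up values for illustration), where the accumulator is created once and keeps the squared-gradient history for the whole run:

import numpy as np

# Toy, eager-style Adagrad for one parameter vector -- what I expect to
# happen over a whole training run (illustrative values, not Keras code).
lr, eps = 0.01, 1e-7
w = np.array([1.0, -2.0])
acc = np.zeros_like(w)            # squared-gradient history, created ONCE

for step in range(100):
    g = 2 * w                     # stand-in gradient (loss = sum(w**2))
    acc += g ** 2                 # accumulates across ALL steps
    w -= lr * g / (np.sqrt(acc) + eps)

If get_updates() ran at every step, wouldn't the accumulator be reset to zeros each time?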
Thanks.
You should do some reading on how Theano and TensorFlow build graphs.
get_updates
builds a graph that can be used to apply the updates; it doesn't actually perform them. Those updates are then run by the training function. It is a weird way to start thinking about things: the Python code builds the graph once, and then tells the GPU to execute it each time it needs to be run.
The accumulator is created locally by get_updates and initialized to zeros. get_updates is called once. Each time the updates returned from get_updates are run, they update that same local variable.
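Here is a plain-Python analogy of that flow (just a sketch of the calling pattern, with NumPy arrays standing in for backend variables, not how Theano/TensorFlow actually execute the graph):

import numpy as np

# "get_updates" is called ONCE.  It creates the accumulators (zeros) and
# returns the recipe that the training function will run over and over.
def get_updates(params, grad_fn, lr=0.01, eps=1e-7):
    accumulators = [np.zeros_like(p) for p in params]   # zeros only here

    def run_updates():                                   # the "graph"
        for p, a, g in zip(params, accumulators, grad_fn(params)):
            a += g ** 2                # mutates the SAME array every call
            p -= lr * g / (np.sqrt(a) + eps)

    return run_updates, accumulators   # cf. self.weights = accumulators

params = [np.array([1.0, -2.0])]
train_step, weights = get_updates(params, lambda ps: [2 * p for p in ps])

for _ in range(5):
    train_step()        # rerunning the updates, NOT re-calling get_updates
print(weights[0])       # squared-gradient history accumulated over 5 steps

As for self.weights = accumulators: if I remember right, that line just exposes those same variables as the optimizer's persistent state (so they can be saved and restored with the model); nothing needs to call it directly.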
Cheers
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs, but feel free to re-open it if needed.