How does Adagrad work in Keras?
Hi all,
I posted a question about Keras' optimizer on Stack Overflow. Can anybody help me answer it?
The function get_updates()
in Adagrad seems to perform only a single update step. But shouldn't the accumulators store the history information? Why are they initialized to zeros at each step? How can they act as accumulators across the whole training process?
What does this line do?
self.weights = accumulators
It seems self.weights is never used again afterwards.
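To make the question concrete, my mental model of Adagrad is the eager NumPy toy below (not the Keras code; the learning rate, epsilon, and gradient are made-up values for illustration), where the accumulator is created once and keeps the squared-gradient history for the whole run:

import numpy as np

# Toy, eager-style Adagrad for one parameter vector -- what I expect to
# happen over a whole training run (illustrative values, not Keras code).
lr, eps = 0.01, 1e-7
w = np.array([1.0, -2.0])
acc = np.zeros_like(w)            # squared-gradient history, created ONCE

for step in range(100):
    g = 2 * w                     # stand-in gradient (loss = sum(w**2))
    acc += g ** 2                 # accumulates across ALL steps
    w -= lr * g / (np.sqrt(acc) + eps)

If get_updates() ran at every step, wouldn't the accumulator be reset to zeros each time?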
Thanks.
You should do some reading on how Theano and TensorFlow build graphs.
get_updates
builds a graph that can be used to apply the updates; it doesn't actually perform them. Those updates are then run by the training function. It is a weird way to start thinking about things: the Python code builds the graph once, and then tells the GPU to execute it each time it needs to be run.
The accumulator is created locally by get_updates and initialized to zeros. get_updates is called once. Each time the updates returned from get_updates are run, they update that same local variable.
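Here is a plain-Python analogy of that flow (just a sketch of the calling pattern, with NumPy arrays standing in for backend variables, not how Theano/TensorFlow actually execute the graph):

import numpy as np

# "get_updates" is called ONCE.  It creates the accumulators (zeros) and
# returns the recipe that the training function will run over and over.
def get_updates(params, grad_fn, lr=0.01, eps=1e-7):
    accumulators = [np.zeros_like(p) for p in params]   # zeros only here

    def run_updates():                                   # the "graph"
        for p, a, g in zip(params, accumulators, grad_fn(params)):
            a += g ** 2                # mutates the SAME array every call
            p -= lr * g / (np.sqrt(a) + eps)

    return run_updates, accumulators   # cf. self.weights = accumulators

params = [np.array([1.0, -2.0])]
train_step, weights = get_updates(params, lambda ps: [2 * p for p in ps])

for _ in range(5):
    train_step()        # rerunning the updates, NOT re-calling get_updates
print(weights[0])       # squared-gradient history accumulated over 5 steps

As for self.weights = accumulators: if I remember right, that line just exposes those same variables as the optimizer's persistent state (so they can be saved and restored with the model); nothing needs to call it directly.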
Cheers
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs, but feel free to re-open it if needed.