decay, decay_rate and decay_steps not implemented
Using latest master it seems to me that decay, decay_rate and decay_steps are not affecting the learning rate at all. Looking in the trainer model, they don’t even seem to be used in the train function.
https://github.com/uber/ludwig/blob/62430e4a0dd7a4fda08d6dcd615fbdbbf53c5377/ludwig/models/trainer.py#L166-L195
learning_rate and learning_rate_warmup_epochs instead work fine (and I see them parsed in the train function).
Am I missing something?
Maybe it’s related to the TF2 port?
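For context, this is roughly the behaviour one would expect decay, decay_rate and decay_steps to produce under TF2: a minimal sketch using tf.keras.optimizers.schedules.ExponentialDecay with made-up values, not Ludwig’s actual trainer code.

```python
import tensorflow as tf

# Illustration of the expected semantics of decay_rate / decay_steps
# (hypothetical values; this is not Ludwig's trainer code).
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.001,  # learning_rate
    decay_steps=10000,            # decay_steps
    decay_rate=0.96,              # decay_rate
    staircase=False,              # True drops the rate in discrete steps instead
)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)

# Effective rate after `step` batches:
#   lr = initial_learning_rate * decay_rate ** (step / decay_steps)
for step in (0, 10000, 20000):
    print(step, float(lr_schedule(step)))
```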
Issue Analytics
- State: Closed
- Created: 3 years ago
- Comments: 15 (9 by maintainers)
Top Results From Across the Web
- Learning Rate Schedules and Adaptive Learning Rate ...: Step Decay: a typical way is to drop the learning rate by half every 10 epochs. To implement this in Keras, ... (a minimal sketch follows this list)
- Keras learning rate schedules and decay - PyImageSearch: One popular learning rate scheduler is step-based decay, where we systematically drop the learning rate after specific epochs during training.
- How the parameters of decay_rate & decay_steps are taken ...: During the last couple of days, I am experimenting with the different schedulers of learning rate decay offered by Keras (link here).
- Learning Rate Decay and methods in Deep Learning - Medium: Learning rate decay is a technique for training modern neural networks. It starts training the network with a large learning rate and then ...
- Properly set up exponential decay of learning rate in tensorflow: To my knowledge, decay_rate should be 1 - decay_factor and decay_steps should mean how many steps are performed before applying the decay, in ...
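For the step decay described in the first result above, a minimal Keras sketch could look like the following; the function and values here are illustrative assumptions, not a Ludwig configuration.

```python
import tensorflow as tf

INITIAL_LR = 0.01  # illustrative starting learning rate

def step_decay(epoch, lr):
    # Halve the learning rate every 10 epochs, starting from INITIAL_LR.
    return INITIAL_LR * 0.5 ** (epoch // 10)

lr_callback = tf.keras.callbacks.LearningRateScheduler(step_decay)
# model.fit(x, y, epochs=50, callbacks=[lr_callback])
```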
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Alright, this is the new behavior after the fixes. The fix changes the place where the learning rate computation is done and some parameters. The graphs show the learning rate:
- without warmup and decay
- when warmup is on
- when decay is on
- when decay with staircase is on
- when warmup and decay are on at the same time

Looks reasonable to me 😃, can you double-check with your use case please?
In the first graph, I guess the axis range chosen by TensorBoard doesn’t make it visible that the learning rate was actually 0.01, but I checked manually and the 0.01 just stayed constant throughout.
Glad this worked, merging the PR and closing.
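For reference, a minimal sketch of the combined warmup-plus-decay case discussed above: linear warmup to the base learning rate, followed by exponential decay with an optional staircase. The class name and all values are hypothetical; this is not the code that was merged.

```python
import tensorflow as tf

class WarmupThenExponentialDecay(tf.keras.optimizers.schedules.LearningRateSchedule):
    """Linear warmup to base_lr, then exponential decay (illustrative only)."""

    def __init__(self, base_lr, warmup_steps, decay_steps, decay_rate, staircase=False):
        super().__init__()
        self.base_lr = base_lr
        self.warmup_steps = warmup_steps
        self.decay = tf.keras.optimizers.schedules.ExponentialDecay(
            base_lr, decay_steps, decay_rate, staircase=staircase)

    def __call__(self, step):
        step = tf.cast(step, tf.float32)
        # Ramp linearly from 0 to base_lr during warmup, then feed the
        # post-warmup step count to the exponential decay schedule.
        warmup_lr = self.base_lr * step / self.warmup_steps
        decayed_lr = self.decay(tf.maximum(step - self.warmup_steps, 0.0))
        return tf.where(step < self.warmup_steps, warmup_lr, decayed_lr)

schedule = WarmupThenExponentialDecay(
    base_lr=0.01, warmup_steps=1000, decay_steps=10000, decay_rate=0.96)
optimizer = tf.keras.optimizers.Adam(learning_rate=schedule)
```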