Grad of weight decay is wrong when using `update_rule.add_hook`
Thank you for fixing this issue https://github.com/chainer/chainer/issues/7335
However, with this modification the grad of weight decay is wrong when the hook is registered with `update_rule.add_hook`.
The flow of the update is as follows.
- call hook of an optimizer https://github.com/chainer/chainer/blob/5289f671411b7eaf90492df6463ff51dd0724e91/chainer/optimizer.py#L810
- divide grad by loss scale https://github.com/chainer/chainer/blob/5289f671411b7eaf90492df6463ff51dd0724e91/chainer/optimizer.py#L206
- call hook of an `UpdateRule` https://github.com/chainer/chainer/blob/5289f671411b7eaf90492df6463ff51dd0724e91/chainer/optimizer.py#L208
- update parameters
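So the hook of an `UpdateRule` runs after grad has already been divided by the loss scale, and the grad of weight decay is `loss_scale` times larger than the expected value when using `update_rule.add_hook`.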
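The ordering problem can be reproduced with a small standalone sketch that does not import Chainer. It assumes that the weight decay hook pre-multiplies its decay term by the loss scale so that the later division cancels it, which is what the flow above implies. The names `weight_decay_hook`, `unscale`, and `simulate`, as well as the numeric values, are hypothetical and only illustrate the order of operations.

```python
def weight_decay_hook(param, rate, loss_scale):
    # Assumption: the hook scales the decay term by loss_scale,
    # expecting grad to be divided by loss_scale afterwards.
    param["grad"] += rate * loss_scale * param["data"]


def unscale(param, loss_scale):
    # Corresponds to the "divide grad by loss scale" step.
    param["grad"] /= loss_scale


def simulate(hook_on_update_rule, loss_scale=128.0, rate=0.01):
    true_grad = 0.5
    # grad arrives already multiplied by the loss scale.
    param = {"data": 2.0, "grad": true_grad * loss_scale}
    if not hook_on_update_rule:
        # Optimizer hook: runs before the division.
        weight_decay_hook(param, rate, loss_scale)
    unscale(param, loss_scale)
    if hook_on_update_rule:
        # UpdateRule hook: runs after the division.
        weight_decay_hook(param, rate, loss_scale)
    # Return only the contribution of weight decay to grad.
    return param["grad"] - true_grad


expected = simulate(hook_on_update_rule=False)
actual = simulate(hook_on_update_rule=True)
print(actual / expected)  # ratio is approximately loss_scale
```

In this sketch the decay contribution from the `UpdateRule` path exceeds the optimizer path by a factor of roughly `loss_scale`, matching the behaviour described above.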
Issue Analytics
- State:
- Created 4 years ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
Done.
This issue is closed as announced. Feel free to re-open it if needed.