Check finite difference gradient approximation in a random direction
See original GitHub issue

scipy.optimize.check_grad is very convenient for checking analytical gradient calculations. Unfortunately, it is not really practical for even moderately large problems, because computing finite differences in every coordinate direction can be painfully slow.

One practical way to test such gradients, which I learned from the autograd project, is to calculate a finite difference in a single random direction instead and compare it to the projection of the analytical gradient onto that direction. This isn't as thorough as check_grad, but in practice it seems to catch most of the same bugs and runs far faster.

Would something like this be welcome in SciPy, perhaps as a random_projection=True keyword argument for check_grad?
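For concreteness, here is a minimal sketch of what such a check could look like. The function name check_grad_random, its signature, and the rng argument are placeholders for this sketch, not an existing SciPy API:

```python
import numpy as np

def check_grad_random(func, grad, x0, eps=1e-6, rng=None):
    # Hypothetical sketch: compare a central finite difference taken along a
    # single random unit direction with the projection of the analytical
    # gradient onto that same direction.
    rng = np.random.default_rng(rng)
    x0 = np.asarray(x0, dtype=float)

    # Draw a random direction and normalize it to unit length.
    v = rng.standard_normal(x0.shape)
    v /= np.linalg.norm(v)

    # Directional finite difference: approximately d/dt f(x0 + t*v) at t = 0.
    fd = (func(x0 + eps * v) - func(x0 - eps * v)) / (2 * eps)

    # Projection of the analytical gradient onto the same direction.
    projected = np.dot(grad(x0), v)

    return abs(fd - projected)

# Example with a gradient that is easy to verify: f(x) = sum(x**2), grad = 2*x.
err = check_grad_random(lambda x: np.sum(x ** 2), lambda x: 2 * x, np.ones(1000))
```

If the analytical gradient is correct, the directional finite difference and the projected gradient should agree to within the finite-difference truncation error; a large discrepancy signals a likely bug.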
Issue Analytics: created 4 years ago · 7 comments (6 by maintainers)
Hi @shoyer, one question: why do we need to compute finite differences in every direction in order to compare the analytical gradient with the numerical one?
Thanks @shoyer for the ideas and detailed explanation.
I was thinking the same. Using one of the one-hot vectors would be convenient in the sense that we wouldn't need to take the projection of the analytical gradient; simply indexing it would work. But bugs could still escape that way, so using a fully random normalized vector would be much better (a quick comparison of both options is sketched below).
I will raise a PR on Monday for this and we can implement and discuss different ways there. Thanks.
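As a rough illustration of the trade-off discussed in the comment above, here is a sketch comparing the two options. The helper directional_error and the test function are hypothetical, chosen only to show that a one-hot direction checks a single gradient component while a random normalized direction mixes in all of them:

```python
import numpy as np

def directional_error(func, grad, x0, v, eps=1e-6):
    # Central finite difference along v compared to the projection grad(x0) @ v.
    fd = (func(x0 + eps * v) - func(x0 - eps * v)) / (2 * eps)
    return abs(fd - np.dot(grad(x0), v))

x0 = np.ones(1000)
func = lambda x: np.sum(np.sin(x))
grad = lambda x: np.cos(x)

# Option 1: a one-hot direction e_i. The projection grad(x0) @ e_i is just
# grad(x0)[i], so no dot product is needed, but only component i gets checked.
e_i = np.zeros_like(x0)
e_i[0] = 1.0
err_one_hot = directional_error(func, grad, x0, e_i)

# Option 2: a fully random normalized direction. Every gradient component
# contributes to the projection, so a bug in any of them can surface.
rng = np.random.default_rng(0)
v = rng.standard_normal(x0.shape)
v /= np.linalg.norm(v)
err_random = directional_error(func, grad, x0, v)
```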