Clearer definition of alpha in KRR
Describe the issue linked to the documentation
In the docs for KRR, it is not especially clear to me what exactly alpha is defined as. When looking at the cited reference in “Machine Learning: A Probabilistic Perspective” (and nearly every other reference for KRR), the expression for the vector of weights is given by w = (K+λI)^{-1}y. How does alpha relate to λ? As suggested in the docs, alpha is defined as (2*C)^{-1}, so I checked out the documentation for LogisticRegression and found that C is the inverse of the regularization strength. Does that make alpha = (2*1/λ)^{-1} = λ/2?
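One way to reconcile the two conventions is sketched below. It assumes the penalized least-squares objective given in the Ridge documentation, and it applies the C-weighted-loss convention of LogisticRegression and LinearSVC to a squared loss purely for comparison (those estimators do not use a squared loss). It also assumes the book's λ multiplies the squared norm of the weights against an unscaled sum of squared errors, which is what the closed form quoted above implies.

```latex
\begin{align}
\text{Ridge convention:}\quad
  & \min_w \; \lVert y - Xw \rVert_2^2 + \alpha \lVert w \rVert_2^2 \\
\text{C convention:}\quad
  & \min_w \; C \lVert y - Xw \rVert_2^2 + \tfrac{1}{2} \lVert w \rVert_2^2
  \;\;\Longleftrightarrow\;\;
  \min_w \; \lVert y - Xw \rVert_2^2 + \tfrac{1}{2C} \lVert w \rVert_2^2 \\
\text{so}\quad
  & \alpha = \frac{1}{2C}, \qquad
  \text{dual solution: } (K + \alpha I)^{-1} y .
\end{align}
```

Under these assumptions, comparing the last line with w = (K+λI)^{-1}y gives alpha = λ. The factor of 1/2 is absorbed into C, so C = 1/(2λ) rather than 1/λ.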
Suggest a potential alternative/fix
In general, I feel like the documentation for the hyperparameters could be made clearer. In the case of something like KRR especially, where there is a closed-form solution, it would be even better if the equation were included on the page, with the arguments clearly corresponding to the terms of the closed-form solution. Even without this, I still remain confused about what precisely alpha is. A clearer explanation, without directing the reader to other functions, would be helpful.
Issue Analytics
- State:
- Created 4 years ago
- Comments: 8 (5 by maintainers)
Top GitHub Comments
Sadly, every author has their own convention for naming things. It takes practice and patience to get used to it and switch between conventions. In scikit-learn, we should be concerned about being consistent with ourselves, but we can’t follow the same notation used in every book (because there are so many).
Note that the docstring for alpha was recently updated to clarify that it’s the regularization parameter, and it now links to the formula. The latest docs are at https://scikit-learn.org/dev/modules/generated/sklearn.kernel_ridge.KernelRidge.html#sklearn.kernel_ridge.KernelRidge
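For anyone who wants to check the relationship numerically, here is a minimal sketch. It compares the fitted dual_coef_ of KernelRidge against a manual solve of (K + alpha*I)^{-1} y. The data, the RBF kernel, and the values of alpha and gamma are arbitrary choices for illustration, not anything taken from the docs.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.metrics.pairwise import rbf_kernel

# Synthetic data, made up for this check
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=30)

alpha = 0.5  # illustrative value of the regularization parameter
gamma = 0.1  # illustrative RBF kernel width

# Fit scikit-learn's estimator
model = KernelRidge(alpha=alpha, kernel="rbf", gamma=gamma).fit(X, y)

# Closed form with alpha playing the role of lambda: (K + alpha*I)^{-1} y
K = rbf_kernel(X, X, gamma=gamma)
dual_manual = np.linalg.solve(K + alpha * np.eye(len(X)), y)

# Same closed form, but with lambda/2 in place of lambda, for comparison
dual_half = np.linalg.solve(K + (alpha / 2) * np.eye(len(X)), y)

coefs_match = np.allclose(model.dual_coef_, dual_manual)
half_matches = np.allclose(model.dual_coef_, dual_half)
print("matches (K + alpha*I)^-1 y:", coefs_match)
print("matches (K + (alpha/2)*I)^-1 y:", half_matches)
```

If the first comparison holds and the second does not, alpha is acting as λ itself in the closed form rather than as λ/2.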
Thanks both, yes I think the new version is good enough. Closing the issue!