Clearer definition of alpha in KRR
Describe the issue linked to the documentation
In the docs for KRR, it is not especially clear to me what exactly alpha is defined as. When looking at the cited reference in “Machine Learning: A Probabilistic Perspective” (and nearly every other reference for KRR), the expression for the vector of weights is given by
w = (K+λI)^{-1}y. How does alpha relate to λ? As suggested in the docs, alpha is defined as (2*C)^{-1}, so I checked out the documentation for LogisticRegression and found that C is the inverse of the regularization strength. Does that make alpha = (2*1/λ)^{-1} = λ/2?
Suggest a potential alternative/fix
In general, I feel like the documentation for the hyperparameters could be made clearer. In the case of something like KRR especially, where there is a closed-form solution, it would be even better if the equation were included on the page where the arguments clearly correspond to the closed-form solution. Even without this, I still remain confused about what precisely alpha is. A clearer explanation, without directing the reader to other functions, would be helpful.
Issue Analytics
- Created 4 years ago
- Comments:8 (5 by maintainers)

Sadly, every author has their own convention for naming things. It takes practice and patience to get used to it and switch between conventions. In scikit-learn, we should be concerned about being consistent with ourselves, but we can’t follow the same notation used in every book (because there are so many).
Note that the docstring for alpha was recently updated to clarify that it is the regularization parameter, and it now links to the formula. The latest docs are at https://scikit-learn.org/dev/modules/generated/sklearn.kernel_ridge.KernelRidge.html#sklearn.kernel_ridge.KernelRidge
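For anyone who wants to check the correspondence numerically, here is a minimal sketch. It assumes that the fitted `dual_coef_` of `KernelRidge` is the solution of (K + alpha*I) c = y, so that alpha plays the role of λ in the closed form quoted in the issue (rather than λ/2). The toy data, the RBF kernel, and the values of `alpha` and `gamma` are illustrative choices, not anything taken from the docs.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.metrics.pairwise import rbf_kernel

# Toy data, made up purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.normal(size=20)

alpha = 0.7  # illustrative regularization strength
gamma = 0.5  # illustrative RBF kernel width

model = KernelRidge(alpha=alpha, kernel="rbf", gamma=gamma).fit(X, y)

# Closed form from the issue with lambda = alpha: c = (K + alpha * I)^{-1} y
K = rbf_kernel(X, X, gamma=gamma)
identity = np.eye(len(X))
closed_form = np.linalg.solve(K + alpha * identity, y)

# The same closed form with lambda = alpha / 2, for comparison.
closed_form_half = np.linalg.solve(K + (alpha / 2) * identity, y)

matches_lambda = np.allclose(model.dual_coef_, closed_form)
matches_half = np.allclose(model.dual_coef_, closed_form_half)
predictions_match = np.allclose(model.predict(X), K @ closed_form)

print("dual_coef_ equals (K + alpha*I)^{-1} y:    ", matches_lambda)
print("dual_coef_ equals (K + (alpha/2)*I)^{-1} y:", matches_half)
print("predict(X) equals K @ closed_form:         ", predictions_match)
```

If the first check holds and the second does not, then alpha is simply the λ of the textbook expression. The `1 / (2C)` remark in the docstring then reads as a statement about how other estimators parameterize their objective, not as an extra factor of 2 in this formula.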
Thanks both, yes I think the new version is good enough. Closing the issue!