BayesianRidge fails when input and output data are of very different sizes
Example 1 (working):
from sklearn.linear_model import LinearRegression, BayesianRidge
import numpy as np

# Ordinary least squares, with the intercept fitted implicitly
ols = LinearRegression()
ols.fit(np.reshape([1, 2], (-1, 1)), np.array([2, 3]).ravel())

# Bayesian ridge on the same data, with the intercept supplied explicitly
# as a column of ones instead
clf = BayesianRidge(compute_score=True, fit_intercept=False)
clf.fit(np.array([[1, 1], [1, 2]]), np.array([2, 3]).ravel())

print(ols.intercept_, ols.coef_[0])
print(clf.coef_[0], clf.coef_[1])
Expected Results
1, 1
Results for OLS and BayesianRidge:
1.0000000000000004 0.9999999999999998
0.9988917252390923 1.0005536752418909
Example 2 (not working):
from sklearn.linear_model import LinearRegression, BayesianRidge
import numpy as np

# Identical setup to Example 1, but the targets are six orders of
# magnitude larger
ols = LinearRegression()
ols.fit(np.reshape([1, 2], (-1, 1)), np.array([2000000, 3000000]).ravel())

clf = BayesianRidge(compute_score=True, fit_intercept=False)
clf.fit(np.array([[1, 1], [1, 2]]), np.array([2000000, 3000000]).ravel())

print(ols.intercept_, ols.coef_[0])
print(clf.coef_[0], clf.coef_[1])
Expected Results
1000000, 1000000
Results for OLS and BayesianRidge:
1000000.0000000005 999999.9999999997
7.692319353001738e-07 1.2307710964802638e-06
Please notice that the only difference between the two examples is the order of magnitude of the endogenous variable! However, although OLS works well in both cases, in the second case the coefficients returned by the Bayesian regression are essentially 0, 0.
Issue Analytics
- Created: 3 years ago
- Comments: 7 (2 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
It is not a bug; it is caused by a bad assumption about the prior distribution.
BayesianRidge makes two prior assumptions: the weights follow a zero-mean Gaussian with precision lambda, and the observation noise is Gaussian with precision alpha; both precisions are in turn given Gamma hyperpriors.
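For reference, this is the probabilistic model documented for scikit-learn's BayesianRidge, written out in LaTeX (the symbols correspond to the estimator's hyperparameter names):

\begin{aligned}
p(y \mid X, w, \alpha) &= \mathcal{N}(y \mid Xw,\ \alpha^{-1} I) \\
p(w \mid \lambda) &= \mathcal{N}(w \mid 0,\ \lambda^{-1} I) \\
\alpha &\sim \mathrm{Gamma}(\alpha_1, \alpha_2), \qquad \lambda \sim \mathrm{Gamma}(\lambda_1, \lambda_2)
\end{aligned}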
Both Gamma distributions have two parameters each (alpha_1, alpha_2 and lambda_1, lambda_2). These parameters are set before training, and each defaults to 1e-6.
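All four are ordinary constructor arguments. A minimal sketch spelling out the documented defaults explicitly (this is equivalent to BayesianRidge() with no arguments):

from sklearn.linear_model import BayesianRidge

# The four Gamma hyperparameters all default to 1e-6
clf = BayesianRidge(alpha_1=1e-6, alpha_2=1e-6,
                    lambda_1=1e-6, lambda_2=1e-6)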
These defaults mean that the weights and the noise are expected to be small. In other words, before training, the prior probability of small weights is high and the probability of large weights is low. But your data requires large weights.
Of course, the model can learn from the data and fit a posterior distribution, but the posterior is still pulled toward the prior (which prefers low weights), so the model cannot estimate good weights.
To fix it, you can bring the data and the prior onto compatible scales, for example as sketched below.
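One way to do this (a minimal sketch, not necessarily the commenter's original code; the unit-scale choice here is an illustrative assumption) is to rescale the target before fitting and undo the scaling afterwards:

from sklearn.linear_model import BayesianRidge
import numpy as np

X = np.array([[1, 1], [1, 2]])
y = np.array([2000000, 3000000], dtype=float)

# Bring y to unit scale so the default small-weight prior is reasonable
scale = y.std()
clf = BayesianRidge(compute_score=True, fit_intercept=False)
clf.fit(X, y / scale)

# Map the learned coefficients back to the original units
coef = clf.coef_ * scale
print(coef)  # should now be close to the OLS solution, about [1e6, 1e6]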
An alternative way is to use fit_intercept=True.
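A minimal sketch of this alternative on the Example 2 data (the exact numbers printed will depend on the scikit-learn version):

from sklearn.linear_model import BayesianRidge
import numpy as np

X = np.array([[1, 1], [1, 2]])
y = np.array([2000000, 3000000])

# Let the estimator center the data and fit the offset itself, rather than
# modeling the intercept as a regularized weight on a column of ones
clf = BayesianRidge(compute_score=True, fit_intercept=True)
clf.fit(X, y)
print(clf.intercept_, clf.coef_)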
@SB6988 I think the best we can offer is to improve the docstring of the class. Feel free to open a PR so the next person is less likely to hit the same difficulty as you.