LinearRegression is sometimes not deterministic
LinearRegression is sometimes not deterministic, as shown by the following test:
from sklearn.linear_model import LinearRegression
from sklearn.utils.testing import assert_array_equal
from sklearn.tests.test_multioutput import generate_multilabel_dataset_with_correlations


def test_investigate_linear_regression_indeterminacy():
    # Is LinearRegression deterministic?
    X, Y = generate_multilabel_dataset_with_correlations()
    y = Y[:, 1]
    # Fit once to get reference coefficients, then refit and compare exactly.
    ref = LinearRegression().fit(X, y).coef_
    for i in range(1000):
        coef = LinearRegression().fit(X, y).coef_
        assert_array_equal(ref, coef, 'iter %d' % i)


test_investigate_linear_regression_indeterminacy()
The test failed at iteration 18 in this Travis job. I can't reproduce it locally. It occurs on only one Travis platform (Linux + Python 3.6.2), and only sometimes (iteration 18 in this test case).
Failure spotted initially in https://github.com/scikit-learn/scikit-learn/pull/9257#issuecomment-350111442
Issue Analytics
- State:
- Created 6 years ago
- Comments:5 (4 by maintainers)
Thanks @maxsargent!
+1 for closing, due to:
- assert_array_equal being too strict for floating-point comparison (compared to assert_allclose), which can lead to false positives (as discussed in the linked scipy issue, BLAS implementations can be non-deterministic within the float eps tolerance).
- the discussion in the corresponding scipy issue https://github.com/scipy/scipy/issues/8208, which is insightful (and remains open).
Please comment/reopen in case of disagreement.
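A minimal sketch of the difference between the two assertions, using only NumPy. It does not reproduce the Travis failure. It only illustrates that floating-point addition is not associative, so the same sum evaluated in a different order (which a BLAS implementation may do) can differ in the last bits. The values, the helper name exactly_equal, and the tolerance rtol=1e-12 are assumptions chosen for illustration.

```python
import numpy as np
from numpy.testing import assert_allclose, assert_array_equal

# Same three numbers, summed in two different orders.
# The results differ by about one unit in the last place.
a = np.array([(0.1 + 0.2) + 0.3])
b = np.array([0.1 + (0.2 + 0.3)])


def exactly_equal(x, y):
    """Hypothetical helper: True if assert_array_equal accepts the pair."""
    try:
        assert_array_equal(x, y)
    except AssertionError:
        return False
    return True


# Exact comparison rejects the pair even though the values agree to ~1e-16.
print(exactly_equal(a, b))

# Tolerance-based comparison accepts it (rtol here is an illustrative choice).
assert_allclose(a, b, rtol=1e-12)
```

With a check like this in place of assert_array_equal, a test such as the one in the report would tolerate differences at the level of float eps while still catching larger discrepancies.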
This issue has come to me through a triage email.
From a brief look at Travis, this issue is no longer occurring. I have tried this code with the Python version mentioned (3.6.2), and it is not reproducible outside of Travis.
Given how long this issue has been open and the fact that it can't be reproduced, I suggest closing it.