BUG: liblinear gets stuck on iris after centering the data
Describe the bug
The sparse logistic regression example surprised me: even with extremely weak regularization (large C), only 1 feature enters the model.
Investigating led me to check whether standardizing X with StandardScaler changes the graph. When X is preprocessed this way, the liblinear solver does not seem to converge.
Steps/Code to Reproduce
from sklearn.svm import l1_min_c
from sklearn import datasets
from sklearn import linear_model
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt
import numpy as np
from time import time
print(__doc__)
# Author: Alexandre Gramfort <alexandre.gramfort@inria.fr>
# License: BSD 3 clause
iris = datasets.load_iris()
X = iris.data
y = iris.target
X = X[y != 2]
y = y[y != 2]
# #############################################################################
# Demo path functions
cs = l1_min_c(X, y, loss='log') * np.logspace(0, 10, 30)
X = StandardScaler().fit_transform(X)
print("Computing regularization path ...")
start = time()
clf = linear_model.LogisticRegression(penalty='l1', solver='liblinear',
                                      tol=1e-6, max_iter=int(1e6),
                                      warm_start=True,
                                      intercept_scaling=10000.)
coefs_ = []
for c in cs:
    clf.set_params(C=c)
    clf.fit(X, y)
    coefs_.append(clf.coef_.ravel().copy())
    print(clf.coef_.ravel(), clf.intercept_)
Expected Results
The code should run fast (iris has only 4 features).
Actual Results
The code gets stuck after the first regularization parameter.
I suspect the scaled intercept column that liblinear appends is involved, because liblinear does not fit an unregularized intercept.
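The scikit-learn documentation for intercept_scaling describes the intercept as the weight of a synthetic constant feature, equal to intercept_scaling, that is penalized like any other coefficient. The sketch below checks this by appending the column by hand and fitting without an intercept. It uses the unscaled data, and the values s = 10 and C = 1 are illustrative choices that are not from the report.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Same two-class subset of iris as in the report (unscaled here).
X, y = load_iris(return_X_y=True)
X, y = X[y != 2], y[y != 2]

s = 10.0  # illustrative intercept_scaling, not the value from the report
params = dict(penalty="l1", solver="liblinear", C=1.0,
              tol=1e-6, max_iter=10000, random_state=0)

# Built-in handling: liblinear appends a constant column equal to s.
builtin = LogisticRegression(fit_intercept=True, intercept_scaling=s,
                             **params).fit(X, y)

# Manual equivalent: append the column ourselves and fit without intercept.
X_aug = np.hstack([X, np.full((X.shape[0], 1), s)])
manual = LogisticRegression(fit_intercept=False, **params).fit(X_aug, y)

manual_coef = manual.coef_[0, :-1]
manual_intercept = s * manual.coef_[0, -1]  # intercept = s * synthetic weight

print(builtin.coef_[0], builtin.intercept_[0])
print(manual_coef, manual_intercept)
```

If the two fits agree, an intercept b contributes |b| / s to the L1 penalty, so a large intercept_scaling makes the intercept nearly unpenalized, but the optimizer then has to work with one column that is orders of magnitude larger than the standardized features.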
Lowering the intercept scaling (intercept_scaling=1) gives a much more reasonable graph,
although I still suspect numerical errors are responsible for the bumps when C becomes large.
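One way to compare the two intercept_scaling settings without risking a hang is to cap max_iter and record, for each C, the iteration count, whether a ConvergenceWarning was raised, and the wall time. This is a diagnostic sketch only. The cap of 200 iterations and the 5-point grid are assumptions chosen to keep the run short, not settings from the report.

```python
import warnings
from time import perf_counter

import numpy as np
from sklearn import datasets
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.svm import l1_min_c

X, y = datasets.load_iris(return_X_y=True)
X, y = X[y != 2], y[y != 2]

# Coarser grid than in the report (5 points instead of 30) to keep it quick.
cs = l1_min_c(X, y, loss="log") * np.logspace(0, 10, 5)
X = StandardScaler().fit_transform(X)

MAX_ITER = 200  # assumed cap so that every fit terminates quickly

report = []
for scaling in (1.0, 10000.0):
    for c in cs:
        clf = LogisticRegression(penalty="l1", solver="liblinear",
                                 tol=1e-6, max_iter=MAX_ITER, C=c,
                                 intercept_scaling=scaling)
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            t0 = perf_counter()
            clf.fit(X, y)
            elapsed = perf_counter() - t0
        # liblinear signals that it hit the cap with a ConvergenceWarning.
        hit_cap = any(issubclass(w.category, ConvergenceWarning)
                      for w in caught)
        report.append((scaling, c, int(clf.n_iter_[0]), hit_cap, elapsed))

for scaling, c, n_iter, hit_cap, elapsed in report:
    print(f"scaling={scaling:g} C={c:.3g} n_iter={n_iter} "
          f"hit_cap={hit_cap} time={elapsed:.3f}s")
```

Rows where hit_cap is True mark the values of C at which the solver stopped at the cap rather than at the tolerance, which can then be compared with where the stall and the bumps occur.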
Versions
Happy to help if it’s a known issue @agramfort
Issue Analytics
- State:
- Created 3 years ago
- Comments:11 (11 by maintainers)
I’ll work on this, curious to see what is happening here.
See proposed fix in #25214
Another possibility would be to fix LIBLINEAR to avoid regularizing the intercept, backporting the fix from https://github.com/cjlin1/liblinear/commit/f68d25cc425a057cd8cdcce1554bce0172a245e8.