BUG: liblinear gets stuck on iris after centering the data

Describe the bug

The example on sparse logistic regression surprised me: even at extremely weak regularization (large C), only one feature enters the model: [image]

Investigating led me to check whether applying StandardScaler to X changed the graph. When X is preprocessed this way, the liblinear solver does not seem to converge.

Steps/Code to Reproduce

from sklearn.svm import l1_min_c
from sklearn import datasets
from sklearn import linear_model
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt
import numpy as np
from time import time
print(__doc__)

# Author: Alexandre Gramfort <alexandre.gramfort@inria.fr>
# License: BSD 3 clause


iris = datasets.load_iris()
X = iris.data
y = iris.target

X = X[y != 2]
y = y[y != 2]


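# #############################################################################
# Demo path functions

# Grid of C values, starting from the lower bound returned by l1_min_c.
# Note: the bound is computed on the raw X, before standardization.
cs = l1_min_c(X, y, loss='log') * np.logspace(0, 10, 30)


# Standardize the features; with this step the fit below no longer finishes.
X = StandardScaler().fit_transform(X)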
print("Computing regularization path ...")
start = time()
clf = linear_model.LogisticRegression(penalty='l1', solver='liblinear',
                                      tol=1e-6, max_iter=int(1e6),
                                      warm_start=True,
                                      intercept_scaling=10000.)
coefs_ = []
for c in cs:
    clf.set_params(C=c)
    clf.fit(X, y)
    coefs_.append(clf.coef_.ravel().copy())
    print(clf.coef_.ravel(), clf.intercept_)


Expected Results

The code should run fast (iris has only 4 features).

Actual Results

The code gets stuck after the first regularization parameter.

I guess something is happening with the scaled intercept column that gets added, because liblinear does not fit an unregularized intercept. Lowering the intercept scaling (intercept_scaling=1) gives a much more reasonable graph: [image]

I still suspect numerical errors are responsible for the bumps when C becomes large.
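To make the intercept-column guess concrete, here is a minimal NumPy sketch of the mechanism described in the scikit-learn documentation for intercept_scaling: a constant synthetic feature equal to intercept_scaling is appended to X, the intercept is intercept_scaling times that feature's weight, and that weight is penalized like any other. The helper names (`augment`, `l1_cost_of_intercept`), the random toy data, and the chosen coefficients are illustrative assumptions, not liblinear's code, and the sketch does not reproduce or explain the hang itself.

```python
import numpy as np


def augment(X, intercept_scaling):
    """Append a constant column equal to intercept_scaling (the synthetic feature)."""
    column = np.full((X.shape[0], 1), float(intercept_scaling))
    return np.hstack([X, column])


def l1_cost_of_intercept(intercept, intercept_scaling):
    """L1 penalty paid by the synthetic weight that realises a given intercept."""
    synthetic_weight = intercept / intercept_scaling
    return abs(synthetic_weight)


# Toy standardized data (assumption: a stand-in for the scaled iris features).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Arbitrary model used only for illustration.
coef = np.array([0.5, -1.0, 0.0, 2.0])
intercept = 3.0

results = {}
for scaling in (1.0, 10000.0):
    X_aug = augment(X, scaling)
    # Same intercept, expressed through the weight of the synthetic column.
    w_aug = np.append(coef, intercept / scaling)
    results[scaling] = {
        "decision": X_aug @ w_aug,
        "intercept_penalty": l1_cost_of_intercept(intercept, scaling),
        "condition_number": np.linalg.cond(X_aug),
    }
    print(
        f"intercept_scaling={scaling:g}: "
        f"penalty on intercept={results[scaling]['intercept_penalty']:.4g}, "
        f"cond(X_aug)={results[scaling]['condition_number']:.3g}"
    )
```

Both scalings give the same decision values, but the larger one makes the intercept almost free under the penalty while making the augmented matrix far worse conditioned. Whether that conditioning is what stalls the solver here is only a hypothesis consistent with the observation above.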

Versions

Happy to help if it’s a known issue @agramfort

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 11 (11 by maintainers)

Top GitHub Comments

Rick-Mackenbach commented, Oct 22, 2020

I’ll work on this, curious to see what is happening here.

TomDLT commented, Dec 19, 2022

See proposed fix in #25214

Another possibility would be to fix LIBLINEAR to avoid regularizing the intercept, backporting the fix from https://github.com/cjlin1/liblinear/commit/f68d25cc425a057cd8cdcce1554bce0172a245e8.
