Scaling issues in l-bfgs for LogisticRegression
So it looks like l-bfgs is very sensitive to the scaling of the data, which can lead to convergence issues. I feel like we might be able to fix this by changing the framing of the optimization?
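Example:
import pandas as pd
from sklearn.datasets import fetch_openml
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import scale
# OpenML dataset 1590, loaded as a DataFrame so it can be one-hot encoded
data = fetch_openml(data_id=1590, as_frame=True)
# Unscaled features with the default solver (l-bfgs) and default max_iter
cross_val_score(LogisticRegression(), pd.get_dummies(data.data), data.target)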
This gives convergence warnings; after scaling, it doesn't. I have seen this in many places. While people should scale their data, I think a warning about the number of iterations is not a good thing to show to the user. If we can fix this, I think we should.
Using the bank campaign data I got coefficients that were quite different if I increased the number of iterations (I got convergence warnings with the default of 100). If I scaled the data, that issue went away.
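To make the comparison easy to reproduce without downloading anything, here is a minimal sketch on synthetic data. The column scales (1 to 10,000) and the helper `fit_and_report` are assumptions made for illustration, not part of the original report, and the exact iteration counts will depend on the scikit-learn and SciPy versions in use.

```python
import warnings

import numpy as np
from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def fit_and_report(model, X, y):
    """Fit `model`; return (iterations used, whether a ConvergenceWarning fired).

    Hypothetical helper, written only for this comparison.
    """
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        model.fit(X, y)
    warned = any(issubclass(w.category, ConvergenceWarning) for w in caught)
    # For a pipeline, the classifier is the last step.
    clf = model[-1] if hasattr(model, "steps") else model
    return int(clf.n_iter_[0]), warned


# Synthetic stand-in for the OpenML data: columns on very different scales.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X = X * np.logspace(0, 4, X.shape[1])  # assumed scales, from 1 up to 10,000

raw_iters, raw_warned = fit_and_report(LogisticRegression(), X, y)
scaled_iters, scaled_warned = fit_and_report(
    make_pipeline(StandardScaler(), LogisticRegression()), X, y
)

print(f"unscaled: n_iter={raw_iters}, warned={raw_warned}")
print(f"scaled:   n_iter={scaled_iters}, warned={scaled_warned}")
```

Putting `StandardScaler` inside the pipeline also means that under `cross_val_score` the scaling statistics are learned on each training fold only.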
Issue Analytics
- State:
- Created 4 years ago
- Reactions:4
- Comments:12 (12 by maintainers)
The fun never ends. Here’s a toy example from my book where liblinear is worse and gives qualitatively different results?!
See https://github.com/amueller/introduction_to_ml_with_python/issues/124
+1: diagonal preconditioner, as we try to solve the canonical problem.
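One way to read the diagonal-preconditioner suggestion is: optimize in standardized coordinates, then map the coefficients back to the original feature space. The sketch below does this by hand with `StandardScaler` and is only an illustration of that idea, not what scikit-learn does internally. The synthetic data and its column scales are assumptions. Note that the L2 penalty is applied to the coefficients in the scaled space, so this is not the same problem as fitting on the raw features with the same `C`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic data with badly scaled columns (assumed scales, for illustration).
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X = X * np.logspace(0, 3, X.shape[1])

# Solve the problem in standardized coordinates: z = (x - mean) / scale.
scaler = StandardScaler().fit(X)
clf = LogisticRegression().fit(scaler.transform(X), y)

# Map the solution back to the original coordinates:
#   w . z + b  =  (w / scale) . x  +  (b - sum(w * mean / scale))
coef_orig = clf.coef_ / scaler.scale_
intercept_orig = clf.intercept_ - (clf.coef_ * scaler.mean_ / scaler.scale_).sum(axis=1)

# Both parameterizations give the same decision values on the raw data.
scores_scaled = clf.decision_function(scaler.transform(X))
scores_orig = (X @ coef_orig.T + intercept_orig).ravel()
print(np.allclose(scores_scaled, scores_orig))
```

The mapped-back coefficients can be used directly on unscaled inputs, so the user never has to see the rescaling.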
Good thinking @amueller, this will be useful!