
Incorrect predictions when fitting a LogisticRegression model on binary outcomes with `multi_class='multinomial'`.


Description

Incorrect predictions when fitting a `LogisticRegression` model on binary outcomes with `multi_class='multinomial'`.

Steps/Code to Reproduce

    from sklearn.linear_model import LogisticRegression
    import sklearn.metrics
    import numpy as np

    # Set up a logistic regression object
    lr = LogisticRegression(C=1000000, multi_class='multinomial',
                            solver='sag', tol=0.0001, warm_start=False,
                            verbose=0)

    # Set independent variable values
    Z = np.array([
       [ 0.        ,  0.        ],
       [ 1.33448632,  0.        ],
       [ 1.48790105, -0.33289528],
       [-0.47953866, -0.61499779],
       [ 1.55548163,  1.14414766],
       [-0.31476657, -1.29024053],
       [-1.40220786, -0.26316645],
       [ 2.227822  , -0.75403668],
       [-0.78170885, -1.66963585],
       [ 2.24057471, -0.74555021],
       [-1.74809665,  2.25340192],
       [-1.74958841,  2.2566389 ],
       [ 2.25984734, -1.75106702],
       [ 0.50598996, -0.77338402],
       [ 1.21968303,  0.57530831],
       [ 1.65370219, -0.36647173],
       [ 0.66569897,  1.77740068],
       [-0.37088553, -0.92379819],
       [-1.17757946, -0.25393047],
       [-1.624227  ,  0.71525192]])
    
    # Set dependent variable values
    Y = np.array([1, 0, 0, 1, 0, 0, 0, 0, 
                  0, 0, 1, 1, 1, 0, 0, 1, 
                  0, 0, 1, 1], dtype=np.int32)

    lr.fit(Z, Y)
    p = lr.predict_proba(Z)
    print(sklearn.metrics.log_loss(Y, p)) # ...

    print(lr.intercept_)
    print(lr.coef_)

Expected Results

If we compare against R, or against scikit-learn with `multi_class='ovr'`, it is clear that the multinomial result is incorrect. The log loss is approximately proportional to the objective function here, since the regularisation is made negligible through the choice of C. We expect the log loss to be roughly 0.5922995.
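
For reference, here is a minimal sketch of that baseline comparison (it assumes the `Z` and `Y` arrays from the snippet above and keeps the same solver settings):

    from sklearn.linear_model import LogisticRegression
    import sklearn.metrics

    # Same data and near-zero regularisation, but with one-vs-rest,
    # which reduces correctly to plain binary logistic regression
    # when there are only two classes.
    lr_ovr = LogisticRegression(C=1000000, multi_class='ovr',
                                solver='sag', tol=0.0001)
    lr_ovr.fit(Z, Y)
    p_ovr = lr_ovr.predict_proba(Z)
    print(sklearn.metrics.log_loss(Y, p_ovr))  # roughly 0.5923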

Actual Results

The actual log loss when using `multi_class='multinomial'` is 0.61505641264.

Further Information

See the Stack Exchange question https://stats.stackexchange.com/questions/306886/confusing-behaviour-of-scikit-learn-logistic-regression-multinomial-optimisation?noredirect=1#comment583412_306886 for more information.

The issue, it seems, is caused at https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/linear_model/logistic.py#L762. In the multinomial case, even if `classes.size == 2`, we cannot reduce to a 1D problem by throwing away one of the coefficient vectors (as we can in ordinary binary logistic regression). This is essentially the difference between softmax (where a redundant parametrisation is allowed) and plain logistic regression.
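
To see why the reduction is unsafe, consider a sketch of the two-class softmax (my illustration with made-up numbers, not scikit-learn's internals). The probability depends only on the difference of the two score vectors, so a single stored vector is only equivalent if that difference is preserved:

    import numpy as np

    def sigmoid(s):
        return 1.0 / (1.0 + np.exp(-s))

    def softmax2(s0, s1):
        # Probability of class 1 under a two-class softmax;
        # it depends only on the difference s1 - s0.
        return np.exp(s1) / (np.exp(s0) + np.exp(s1))

    x = np.array([1.5, -0.3])
    w0 = np.array([0.2, 0.7])   # coefficient vector for class 0
    w1 = np.array([-0.4, 1.1])  # coefficient vector for class 1

    # Softmax over two scores equals a sigmoid of the score difference:
    print(np.isclose(softmax2(w0 @ x, w1 @ x), sigmoid((w1 - w0) @ x)))  # True

    # Dropping w0 (implicitly treating it as zero) changes the
    # probability unless w0 was zero to begin with:
    print(np.isclose(softmax2(0.0, w1 @ x), softmax2(w0 @ x, w1 @ x)))  # False here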

This can be fixed by commenting out lines 762 and 763. I am apprehensive, however, that this may cause other unknown issues, which is why I am posting this as a bug.

Versions

  • Linux-4.10.0-33-generic-x86_64-with-Ubuntu-16.04-xenial
  • Python 3.5.2 (default, Nov 17 2016, 17:05:23)
  • NumPy 1.13.1
  • SciPy 0.19.1
  • Scikit-Learn 0.19.0

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Comments: 18 (18 by maintainers)

Top GitHub Comments

1 reaction
TomDLT commented, Oct 11, 2017

So to sum up, we need to:

Do you want to do it @rwolst?

1 reaction
jnothman commented, Oct 11, 2017

I think it would be excessive noise to warn as such upon fit. Why not just amend the `coef_` description? Most users will not be manually making probabilistic interpretations of `coef_` in any case, and we can't in general stop users misinterpreting things on the basis of assumption rather than reading the docs…
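
For what it's worth, here is a minimal sketch of the interpretation pitfall being discussed (my illustration, not scikit-learn's internals): under the symmetric parametrisation w0 = -w1, which is the minimum-norm representative an L2 penalty selects for the redundant two-class softmax, the predicted probability is a sigmoid with coefficients twice as large, so coefficients stored this way are half the size of the equivalent plain binary-logistic ones:

    import numpy as np

    def sigmoid(s):
        return 1.0 / (1.0 + np.exp(-s))

    x = np.array([1.5, -0.3])
    w1 = np.array([-0.4, 1.1])
    w0 = -w1  # symmetric two-class softmax parametrisation

    p = np.exp(w1 @ x) / (np.exp(w0 @ x) + np.exp(w1 @ x))
    # Equivalent to a plain sigmoid with doubled coefficients:
    print(np.isclose(p, sigmoid(2 * (w1 @ x))))  # True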


