Incorrect predictions when fitting a LogisticRegression model on binary outcomes with `multi_class='multinomial'`.
Description
Incorrect predictions when fitting a LogisticRegression model on binary outcomes with `multi_class='multinomial'`.
Steps/Code to Reproduce
from sklearn.linear_model import LogisticRegression
import sklearn.metrics
import numpy as np

# Set up a logistic regression object
lr = LogisticRegression(C=1000000, multi_class='multinomial',
                        solver='sag', tol=0.0001, warm_start=False,
                        verbose=0)

# Set independent variable values
Z = np.array([
    [ 0.        ,  0.        ],
    [ 1.33448632,  0.        ],
    [ 1.48790105, -0.33289528],
    [-0.47953866, -0.61499779],
    [ 1.55548163,  1.14414766],
    [-0.31476657, -1.29024053],
    [-1.40220786, -0.26316645],
    [ 2.227822  , -0.75403668],
    [-0.78170885, -1.66963585],
    [ 2.24057471, -0.74555021],
    [-1.74809665,  2.25340192],
    [-1.74958841,  2.2566389 ],
    [ 2.25984734, -1.75106702],
    [ 0.50598996, -0.77338402],
    [ 1.21968303,  0.57530831],
    [ 1.65370219, -0.36647173],
    [ 0.66569897,  1.77740068],
    [-0.37088553, -0.92379819],
    [-1.17757946, -0.25393047],
    [-1.624227  ,  0.71525192]])

# Set dependent variable values
Y = np.array([1, 0, 0, 1, 0, 0, 0, 0,
              0, 0, 1, 1, 1, 0, 0, 1,
              0, 0, 1, 1], dtype=np.int32)

lr.fit(Z, Y)
p = lr.predict_proba(Z)
print(sklearn.metrics.log_loss(Y, p))  # ...
print(lr.intercept_)
print(lr.coef_)
Expected Results
If we compare against R, or against fitting with multi_class='ovr', the log loss (which is approximately proportional to the objective function, since the regularisation is made negligible through the choice of C) is incorrect. We expect a log loss of roughly 0.5922995.
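For comparison, here is a minimal sketch of that 'ovr' run, reusing Z and Y from the code above with everything else unchanged except multi_class; the name lr_ovr is illustrative. It should reproduce a log loss close to 0.5922995:

# Same settings as above, but with multi_class='ovr'
lr_ovr = LogisticRegression(C=1000000, multi_class='ovr',
                            solver='sag', tol=0.0001, warm_start=False,
                            verbose=0)
lr_ovr.fit(Z, Y)
# Should print approximately 0.5922995
print(sklearn.metrics.log_loss(Y, lr_ovr.predict_proba(Z)))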
Actual Results
The actual log loss when using multi_class='multinomial' is 0.61505641264.
Further Information
See the Stack Exchange question https://stats.stackexchange.com/questions/306886/confusing-behaviour-of-scikit-learn-logistic-regression-multinomial-optimisation?noredirect=1#comment583412_306886 for more information.
The issue seems to be caused by https://github.com/scikit-learn/scikit-learn/blob/ef5cb84a/sklearn/linear_model/logistic.py#L762. In the multinomial case, even if classes.size == 2, we cannot reduce to a 1D case by throwing away one of the vectors of coefficients (as we can in normal binary logistic regression). This is essentially a difference between softmax regression (where the parameterisation is redundant) and binary logistic regression.
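To illustrate the redundancy, here is a small sketch with made-up symmetric coefficients (w for class 1, -w for class 0; not values taken from the issue). The correct two-class softmax probability equals sigmoid(2 * f), so keeping only one coefficient row and feeding it through a sigmoid effectively halves the decision function:

import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Illustrative (made-up) symmetric two-class softmax solution for one sample
w = np.array([0.8, -0.3])   # coefficients for class 1; class 0 has -w
x = np.array([1.5, 2.0])
f = w @ x                   # decision value for class 1; class 0 gets -f

# Correct multinomial probability: softmax over [-f, f] == sigmoid(2 * f)
p_softmax = np.exp(f) / (np.exp(f) + np.exp(-f))
print(p_softmax, sigmoid(2 * f))   # identical values

# Discarding -w and applying a plain sigmoid gives a different probability
print(sigmoid(f))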
This can be fixed by commenting out lines 762 and 763. However, I am apprehensive that this may cause other, unknown issues, which is why I am posting this as a bug.
Versions
Linux-4.10.0-33-generic-x86_64-with-Ubuntu-16.04-xenial
Python 3.5.2 (default, Nov 17 2016, 17:05:23)
NumPy 1.13.1
SciPy 0.19.1
Scikit-Learn 0.19.0
Issue Analytics
- Created 6 years ago
- Comments: 18 (18 by maintainers)
Top GitHub Comments
So to sum up, we need to:
- fix predict_proba as described in https://github.com/scikit-learn/scikit-learn/issues/9889#issuecomment-335129554 (see the sketch after this list)
- amend coef_'s docstring
- add a whats_new entry
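For illustration, a minimal sketch (not the actual scikit-learn patch) of the kind of predict_proba behaviour the linked comment describes for the binary multinomial case: build a two-column decision [-d, d] and apply a softmax, rather than passing d through a sigmoid. The helper name is hypothetical:

import numpy as np

# Hypothetical helper, not scikit-learn's actual code: given the 1D decision
# values d for the positive class, return probabilities via softmax over
# [-d, d] instead of sigmoid(d).
def multinomial_binary_proba(decision):
    decision_2d = np.c_[-decision, decision]
    e = np.exp(decision_2d - decision_2d.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Example: decision values for three samples
print(multinomial_binary_proba(np.array([-1.0, 0.0, 2.0])))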
Do you want to do it @rwolst?
I think it would be excessive noise to emit such a warning upon fit. Why not just amend the coef_ description? Most users will not be manually making probabilistic interpretations of coef_ in any case, and we can't in general stop users from misinterpreting things on the basis of assumption rather than reading the docs…