.predict_proba() for SVC produces incorrect results for binary classification

Description

svm.predict_proba() produces reversed results for binary classification.

This seems to be specific to binary classification. For example, it works fine for 3-way classification, which is covered by this test: https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/svm/tests/test_svm.py#L307
I think this is related to #394, but in this case we are using the same svm object, so I think predict and predict_proba should agree.
Acknowledgement: @amennen noticed this error 👍

Steps/Code to Reproduce

Here’s the code on Colab: https://colab.research.google.com/github/qihongl/random/blob/master/sklearn-svm-predict-proba-bug.ipynb

import numpy as np
from sklearn.svm import SVC

# train an SVM on two points, one per class
X = np.array([[-1, -1], [1, 1]])
y = np.array([0, 1])
svm = SVC(probability=True)
svm.fit(X, y)

# SVM makes reasonable prediction on the learned examples... 
print(svm.predict(X))

# but it makes the reversed probability estimates... 
print(svm.predict_proba(X))

Expected Results

[0 1]
[[0.66383953 0.33616047]
 [0.33916469 0.66083531]]

# i.e. when the prediction is class 0, prob(class0) should be bigger than prob(class1)

Actual Results

[0 1]
[[0.33616047 0.66383953]
 [0.66083531 0.33916469]]

# i.e. when the prediction is class 0, prob(class0) < prob(class1)
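
The mismatch between the two outputs can be checked mechanically. The helper below is a minimal sketch in plain Python; `disagreeing_rows` is a hypothetical name, not a scikit-learn function, and it assumes that column j of the probability matrix corresponds to `classes[j]` (the ordering scikit-learn exposes as `classes_`). The numbers are copied from the Expected Results and Actual Results listings.

```python
def disagreeing_rows(predictions, probabilities, classes):
    """Return the row indices where the predicted label differs from the
    label with the highest predicted probability."""
    mismatches = []
    for i, (label, row) in enumerate(zip(predictions, probabilities)):
        # Assumption: column j of `probabilities` belongs to classes[j].
        best_column = max(range(len(row)), key=row.__getitem__)
        if classes[best_column] != label:
            mismatches.append(i)
    return mismatches


# Values copied from the listings in this issue.
predicted = [0, 1]
actual_proba = [[0.33616047, 0.66383953], [0.66083531, 0.33916469]]
expected_proba = [[0.66383953, 0.33616047], [0.33916469, 0.66083531]]

print(disagreeing_rows(predicted, actual_proba, classes=[0, 1]))    # [0, 1]
print(disagreeing_rows(predicted, expected_proba, classes=[0, 1]))  # []
```

With a fitted estimator, the same check could be fed `svm.predict(X).tolist()`, `svm.predict_proba(X).tolist()` and `svm.classes_.tolist()`.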

Versions

0.20.3

Issue Analytics

Created 4 years ago
Comments: 22 (12 by maintainers)

Top GitHub Comments

1 reaction

amueller commented, Apr 1, 2020

I came to the same conclusion in my analysis, and could also reproduce the same behavior with CalibratedClassifierCV when not using stratification. This is the evil of LOO in small noisy datasets.

So I agree with closing this as a separate bug.
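
One way to picture the leave-one-out problem on the two-sample reproduction is the toy sketch below. It is an illustration under stated assumptions, not libsvm's or scikit-learn's actual code: `held_out_score` is a hypothetical stand-in for a model trained on a single class (it can only vote for that class), and `fit_sigmoid` fits a Platt-style sigmoid P(y=1 | f) = 1 / (1 + exp(a*f + b)) by plain gradient descent on log loss, without the target smoothing used in real Platt scaling.

```python
import math


def held_out_score(train_labels):
    # Toy assumption: a model that saw only one class scores every input
    # as that class (+1 for class 1, -1 for class 0).
    return 1.0 if all(label == 1 for label in train_labels) else -1.0


def fit_sigmoid(scores, labels, steps=500, lr=0.1):
    """Fit P(y=1 | f) = 1 / (1 + exp(a*f + b)) by gradient descent on log loss."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for f, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(a * f + b))
            # With this sign convention, d(loss)/d(a*f + b) = y - p.
            grad_a += (y - p) * f
            grad_b += (y - p)
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


labels = [0, 1]

# Leave-one-out: each held-out point is scored by a model that only saw
# the other class, so the scores point away from the true labels.
loo_scores = [
    held_out_score(labels[:i] + labels[i + 1:]) for i in range(len(labels))
]

a, b = fit_sigmoid(loo_scores, labels)


def prob_class1(decision_value):
    return 1.0 / (1.0 + math.exp(a * decision_value + b))


# A well-behaved calibration has a < 0 (probability rises with the score).
# Here a > 0, so a positive decision value maps to a probability below 0.5.
print(loo_scores, a > 0, prob_class1(1.0) < 0.5)
```

Under these assumptions the sigmoid is fitted with the wrong orientation, which is the kind of reversal reported in this issue for a two-point training set.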

0 reactions

123gkc commented, Mar 11, 2022

Reopening this as I ran into the same issue. According to my observations, the mismatch between .predict() and the argmax of .predict_proba() only comes up when class_weight is set. Sharing a modified version of @rth's code to recreate the issue:

import numpy as np

from sklearn.preprocessing import StandardScaler
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

mask = (y == 0) | (y == 1)
X = X[mask, :]
y = y[mask]  # make a binary classification problem

X = StandardScaler().fit_transform(X)


class CalibratedSVC(SVC):
    def __init__(self):
        super().__init__(probability=True, class_weight={0:0.9, 1:0.1})

    def predict(self, X):
        # binary classification with targets [0, 1] only
        y_proba = self.predict_proba(X)
        return np.argmax(y_proba, axis=1)


for est in [SVC(probability=False, class_weight={0:0.9, 1:0.1}), CalibratedSVC()]:
    est.fit(X, y)
    y_pred = est.predict(X)
    score = accuracy_score(y, y_pred)
    print(f'Model {est}')
    print(f'  - train accuracy={score:.6f}')

produces

Model SVC(class_weight={0: 0.9, 1: 0.1})
 - train accuracy=0.997222
Model CalibratedSVC()
 - train accuracy=1.000000

My best hypothesis at the moment is that .predict_proba() gives raw probabilities, whereas .predict() additionally takes class_weight into account on top of the outputs of .predict_proba() to make the final inference.
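
This hypothesis is unconfirmed. A different reading, in line with the scikit-learn documentation's note that predict_proba may be inconsistent with predict, is that predict follows the sign of the decision function while predict_proba passes the decision value through a separately fitted sigmoid (Platt scaling). The sketch below uses hypothetical sigmoid parameters `A` and `B`, chosen only for illustration, to show that the two rules can disagree whenever the sigmoid's 0.5 crossing is not at a decision value of 0.

```python
import math


def platt_probability(decision_value, a, b):
    """P(y=1 | f) under a Platt-style sigmoid, 1 / (1 + exp(a*f + b))."""
    return 1.0 / (1.0 + math.exp(a * decision_value + b))


# Hypothetical sigmoid parameters, not taken from any fitted model.
A, B = -2.0, 1.0

# The probability equals 0.5 where a*f + b == 0, i.e. at f = -b/a.
crossover = -B / A

for f in (-0.5, 0.25, 1.0):
    label_from_sign = int(f > 0)                               # predict-style rule
    label_from_proba = int(platt_probability(f, A, B) > 0.5)   # argmax-style rule
    print(f, label_from_sign, label_from_proba)
```

Any sample whose decision value falls between 0 and the crossover gets a different label from each rule, so a small accuracy gap like the one above would not by itself require class weights to be applied on top of the probabilities.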
