.predict_proba() for SVC produces incorrect results for binary classification

See original GitHub issue

Description

svm.predict_proba() produces reversed results for binary classification.

Steps/Code to Reproduce

Here’s the code on Colab: https://colab.research.google.com/github/qihongl/random/blob/master/sklearn-svm-predict-proba-bug.ipynb

import numpy as np
from sklearn.svm import SVC

# train an SVM on two points, one per class
X = np.array([[-1, -1], [1, 1]])
y = np.array([0, 1])
svm = SVC(probability=True)
svm.fit(X, y)

# SVM makes reasonable prediction on the learned examples... 
print(svm.predict(X))

# but it makes the reversed probability estimates... 
print(svm.predict_proba(X))

Expected Results

[0 1]
[[0.66383953 0.33616047]
 [0.33916469 0.66083531]]

# i.e. when the prediction is class 0, prob(class0) should be bigger than prob(class1)

Actual Results

[0 1]
[[0.33616047 0.66383953]
 [0.66083531 0.33916469]]

# i.e. when the prediction is class 0, prob(class0) < prob(class1)

Versions

0.20.3

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 22 (12 by maintainers)

Top GitHub Comments

amueller commented, Apr 1, 2020

I came to the same conclusion in my analysis, and could also reproduce the same behavior with CalibratedClassifierCV when not using stratification. This is the evil of LOO in small noisy datasets.

So I agree with closing this as a separate bug.
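
One way to picture the reversal: per the scikit-learn documentation, SVC(probability=True) calibrates its probabilities with Platt scaling, fitting a sigmoid to decision values obtained through internal cross-validation. The sketch below is not the library's code. It uses only the standard library, a hypothetical helper named platt_probability, and made-up slope values, and it assumes the sigmoid form 1 / (1 + exp(a * f + b)). It shows that if the held-out scores end up anti-correlated with the labels (plausible with only two training points), the fitted slope changes sign and the probabilities come out reversed, while the sign of the decision value, and therefore predict(), stays the same.

```python
import math


def platt_probability(decision_value, a, b):
    """P(class 1 | decision value) under a Platt-style sigmoid (assumed form)."""
    return 1.0 / (1.0 + math.exp(a * decision_value + b))


# Hypothetical decision values: negative for the class-0 sample,
# positive for the class-1 sample.
decision_values = [-1.0, 1.0]

# Hypothetical slopes: a < 0 is the usual, well-behaved fit; a > 0 is what a
# fit on anti-correlated held-out scores could produce.
for a in (-0.7, 0.7):
    probs = [platt_probability(f, a, 0.0) for f in decision_values]
    print(f"a={a:+.1f}: P(class 1) = {[round(p, 3) for p in probs]}")
```

With the negative slope, the class-1 sample gets the higher P(class 1). With the positive slope, the same decision values yield the opposite ordering, which matches the shape of the reversed output reported above.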

123gkc commented, Mar 11, 2022

Reopening this as I ran into the same issue. According to my observations, the mismatch between .predict() and the argmax of .predict_proba() only comes up when class_weight is set. Sharing a modified version of @rth's code to reproduce the issue:

import numpy as np

from sklearn.preprocessing import StandardScaler
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

mask = (y == 0) | (y == 1)
X = X[mask, :]
y = y[mask]  # make a binary classification problem

X = StandardScaler().fit_transform(X)


class CalibratedSVC(SVC):
    def __init__(self):
        super().__init__(probability=True, class_weight={0:0.9, 1:0.1})

    def predict(self, X):
        # binary classification with targets [0, 1] only
        y_proba = self.predict_proba(X)
        return np.argmax(y_proba, axis=1)


for est in [SVC(probability=False, class_weight={0:0.9, 1:0.1}), CalibratedSVC()]:
    est.fit(X, y)
    y_pred = est.predict(X)
    score = accuracy_score(y, y_pred)
    print(f'Model {est}')
    print(f'  - train accuracy={score:.6f}')

produces

Model SVC(class_weight={0: 0.9, 1: 0.1})
  - train accuracy=0.997222
Model CalibratedSVC()
  - train accuracy=1.000000

My best hypothesis at the moment is that .predict_proba() returns raw probabilities, whereas .predict() additionally takes class_weight into account on top of the outputs of .predict_proba() to make the final inference.
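
A different reading is also possible. The scikit-learn documentation notes that, in binary classification, predict() may label a sample as positive even when predict_proba() gives it less than 0.5, because the probabilities come from a separately fitted calibration step and not from the decision values directly. The sketch below illustrates that mechanism only and does not reproduce the run above. The sigmoid parameters A and B and the decision values are hypothetical, and label_from_decision and label_from_probability are illustrative helpers, not library functions.

```python
import math

# Hypothetical calibration parameters; a non-zero offset B moves the point
# where the probability crosses 0.5 away from decision value 0.
A, B = -2.0, 1.0


def platt_probability(decision_value):
    """P(class 1 | decision value) under a Platt-style sigmoid (assumed form)."""
    return 1.0 / (1.0 + math.exp(A * decision_value + B))


def label_from_decision(decision_value):
    # Thresholds the decision value at 0.
    return 1 if decision_value > 0 else 0


def label_from_probability(decision_value):
    # Equivalent to taking the argmax over [1 - p, p] in the binary case.
    return 1 if platt_probability(decision_value) > 0.5 else 0


for f in (-1.0, 0.2, 0.8):
    p = platt_probability(f)
    print(f"f={f:+.1f}  p={p:.3f}  "
          f"by decision={label_from_decision(f)}  "
          f"by probability={label_from_probability(f)}")
```

Here the probability crosses 0.5 at f = -B / A = 0.5, so any sample with a decision value between 0 and 0.5 is labeled differently by the two rules. If this is what happens in the report above, thresholding decision_function() directly would avoid depending on the calibration step.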
