When to use dual=False in LinearSVC? The documentation looks confusing

https://github.com/scikit-learn/scikit-learn/blob/fd237278e895b42abe8d8d09105cbb82dc2cbba7/sklearn/svm/_classes.py#L39

https://github.com/scikit-learn/scikit-learn/blob/fd237278e895b42abe8d8d09105cbb82dc2cbba7/sklearn/svm/_classes.py#L41

The default value for dual is True, which should correspond to the case where n_samples > n_features (the usual case). But the documentation also says "Prefer dual=False when n_samples > n_features", which looks really confusing to me.

So, in which case should we use dual=False?

Thanks!

Issue Analytics

State:
Created 3 years ago
Comments: 6 (2 by maintainers)

Top GitHub Comments

whyisyoung commented, May 26, 2020

In the primal formulation of linear SVC (i.e., dual=False), the optimisation variable is of dimension n_features, whereas in the dual formulation (i.e., dual=True), the variable is of dimension n_samples. More importantly, the dual formulation requires the computation of an n_samples x n_samples matrix. For this reason, when n_samples > n_features it is better to use dual=False.

See here for more details.
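To make the difference concrete, here is a minimal sketch that fits both formulations on a tall dataset. The synthetic data from make_classification, the sample and feature counts, and the raised max_iter are assumptions chosen for illustration; they are not from the issue. Both calls solve the same problem with the default l2 penalty and squared hinge loss, so the learned weights have the same shape and the training accuracy should be close.

```python
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Illustrative data with n_samples much larger than n_features (assumed sizes).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Primal formulation: the solver optimises over n_features weights.
primal = LinearSVC(dual=False, random_state=0).fit(X, y)

# Dual formulation: the solver optimises over n_samples dual variables.
# max_iter is raised here only to reduce the chance of a convergence warning.
dual = LinearSVC(dual=True, max_iter=10000, random_state=0).fit(X, y)

# Either way the fitted model exposes one weight per feature.
print(primal.coef_.shape, dual.coef_.shape)
print(primal.score(X, y), dual.score(X, y))
```

On data shaped like this, timing the two fit calls is a simple way to check which setting is cheaper for your own case.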

I think that for most datasets, n_samples > n_features is the usual case. So why not make dual=False the default?

whyisyoung commented, Feb 1, 2021

Thanks @gkevinyen5418. That makes a lot of sense to me. I will switch to dual=False in my case, where n_samples > n_features, and close this issue.
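The rule of thumb from this thread can be written as a small helper. This is only a sketch: make_linear_svc is a hypothetical name, not part of scikit-learn, and treating the tie case n_samples == n_features as dual=True is an arbitrary assumption. Newer scikit-learn releases (1.3 and later, as far as I know) also accept dual="auto", which makes a similar choice internally.

```python
from sklearn.svm import LinearSVC


def make_linear_svc(n_samples, n_features, **kwargs):
    """Hypothetical helper: pick `dual` from the shape of the training data."""
    # Prefer the primal problem (dual=False) when there are more samples
    # than features; otherwise fall back to the dual problem.
    use_dual = n_samples <= n_features
    # Note: some combinations, such as loss="hinge" with dual=False,
    # are rejected by LinearSVC when fit is called.
    return LinearSVC(dual=use_dual, **kwargs)


tall = make_linear_svc(n_samples=10000, n_features=50)   # dual=False
wide = make_linear_svc(n_samples=50, n_features=10000)   # dual=True
```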
