When to use dual=False in LinearSVC? The documentation looks confusing

https://github.com/scikit-learn/scikit-learn/blob/fd237278e895b42abe8d8d09105cbb82dc2cbba7/sklearn/svm/_classes.py#L39

https://github.com/scikit-learn/scikit-learn/blob/fd237278e895b42abe8d8d09105cbb82dc2cbba7/sklearn/svm/_classes.py#L41

The default value for dual is True, which should correspond to the case where n_samples > n_features (the usual case). But the documentation also says "Prefer dual=False when n_samples > n_features", which looks really confusing to me.

So, in which case should we use dual=False?

Thanks!

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

3 reactions
whyisyoung commented, May 26, 2020

In the primal formulation of linear SVC (i.e. dual=False), the optimisation variable has dimension n_features, whereas in the dual formulation (i.e. dual=True) it has dimension n_samples. More importantly, the dual formulation requires computing an n_samples × n_samples matrix. For this reason, when n_samples > n_features it is better to use dual=False.

See here for more details.
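
To make the comparison concrete, here is a minimal sketch that fits both formulations on a synthetic dataset with far more samples than features. The dataset sizes, random_state and max_iter values are assumptions chosen for illustration, not values from the issue. With the default penalty and loss, both settings solve the same problem, so the learned models should score similarly; only the optimisation route differs.

```python
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Synthetic data with n_samples >> n_features (sizes are illustrative assumptions).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Primal problem: the solver optimises over n_features coefficients.
primal = LinearSVC(dual=False, random_state=0).fit(X, y)

# Dual problem: the solver optimises over n_samples dual variables.
# max_iter is raised as a precaution; the dual solver may need more iterations.
dual = LinearSVC(dual=True, max_iter=10000, random_state=0).fit(X, y)

# Both expose the same kind of linear model: one weight per feature.
print(primal.coef_.shape, dual.coef_.shape)
print(primal.score(X, y), dual.score(X, y))
```

Timing the two fit calls on your own data is the most direct way to see which setting suits a given shape.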

I think for most datasets, n_samples > n_features is the usual case. Then why not set dual=False as the default?

0 reactions
whyisyoung commented, Feb 1, 2021

Thanks @gkevinyen5418. That makes a lot of sense to me. I will change to use dual=False in my case where n_samples > n_features and close this issue.
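
The rule of thumb from this thread can be sketched as a small shape check. Note that prefer_dual is a hypothetical helper written for this example, not part of scikit-learn, and the dataset sizes are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC


def prefer_dual(n_samples: int, n_features: int) -> bool:
    """Hypothetical helper: pick the dual only when samples do not outnumber features."""
    return n_samples <= n_features


# Illustrative data with more samples than features.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

use_dual = prefer_dual(*X.shape)  # False here, so the primal problem is solved
clf = LinearSVC(dual=use_dual, random_state=0).fit(X, y)
print(use_dual)
```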
