gamma='scale' in SVC
I believe that setting gamma='scale' in SVC is not meeting its intended purpose of being invariant to the scale of X. Currently, gamma is set to 1 / (n_features * X.std()). However, I believe it should be 1 / (n_features * X.var()).
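As a quick illustration of how the two candidate formulas respond to rescaling X, here is a minimal sketch. The helper names gamma_std and gamma_var are hypothetical and only mirror the two expressions above; they are not scikit-learn functions.

```python
import numpy as np

def gamma_std(X):
    # Hypothetical helper: the formula described above as the current behaviour.
    return 1.0 / (X.shape[1] * X.std())

def gamma_var(X):
    # Hypothetical helper: the formula proposed in this issue.
    return 1.0 / (X.shape[1] * X.var())

rng = np.random.default_rng(0)
X = rng.random((100, 10))
c = 10.0  # factor by which X is rescaled

# Ratio of gamma before and after multiplying X by c.
ratio_std = gamma_std(X) / gamma_std(c * X)
ratio_var = gamma_var(X) / gamma_var(c * X)
print(ratio_std)  # approximately c
print(ratio_var)  # approximately c**2
```

Only the variance-based formula shrinks gamma by the square of the scaling factor.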
Rationale: if you scale X by 10, you need to scale gamma by 1/100, not 1/10, to achieve the same results. See the definition of the RBF kernel: the "units" of gamma are 1/x^2, not 1/x.
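The units argument can be written out explicitly. This is a sketch using the standard RBF kernel definition; the scaling factor c and the rescaled parameter γ' are symbols introduced here for illustration.

```latex
K_\gamma(x, x') = \exp\!\left(-\gamma \,\lVert x - x' \rVert^2\right)

K_{\gamma'}(c x, c x') = \exp\!\left(-\gamma' c^2 \,\lVert x - x' \rVert^2\right)

K_{\gamma'}(c x, c x') = K_\gamma(x, x') \quad\Longleftrightarrow\quad \gamma' = \frac{\gamma}{c^2}
```

Since Var(cX) = c^2 Var(X) while std(cX) = |c| std(X), only a gamma proportional to 1 / X.var() picks up the required factor of 1/c^2.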
I also tested this empirically: scaling X by 10 and scaling gamma by 1/100 gives the same result as the original, whereas scaling X by 10 and scaling gamma by 1/10 gives a different result. Here is some code:
```python
import numpy as np
from sklearn.svm import SVC

X = np.random.rand(100, 10)
y = np.random.choice(2, size=100)

# baseline: original X, gamma = 1
svm = SVC(gamma=1)
svm.fit(X, y)
print(svm.decision_function(X[:5]))

# scale X by 10, gamma by 1/100
svm = SVC(gamma=0.01)
svm.fit(10 * X, y)
print(svm.decision_function(10 * X[:5]))  # prints same result

# scale X by 10, gamma by 1/10
svm = SVC(gamma=0.1)
svm.fit(10 * X, y)
print(svm.decision_function(10 * X[:5]))  # prints different result
```
Note that gamma='scale' will become the default setting for gamma in version 0.22.
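Because the default affects anyone who does not standardize their features, an end-to-end check of both formulas may be useful. The sketch below passes gamma explicitly instead of relying on gamma='scale', so it does not depend on which formula the installed version uses. The helpers decision_values, gamma_var and gamma_std are hypothetical names introduced for this example.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((100, 10))
y = rng.integers(0, 2, size=100)

def decision_values(X, gamma):
    # Fit on X and evaluate on the first five training rows.
    return SVC(gamma=gamma).fit(X, y).decision_function(X[:5])

def gamma_var(X):
    # Hypothetical helper: variance-based formula proposed in this issue.
    return 1.0 / (X.shape[1] * X.var())

def gamma_std(X):
    # Hypothetical helper: std-based formula described as the current behaviour.
    return 1.0 / (X.shape[1] * X.std())

# Compute gamma from the data at each scale, as a 'scale' heuristic would.
ref_var = decision_values(X, gamma_var(X))
scaled_var = decision_values(10 * X, gamma_var(10 * X))
ref_std = decision_values(X, gamma_std(X))
scaled_std = decision_values(10 * X, gamma_std(10 * X))

# Largest change in the decision function caused by rescaling X.
print(np.abs(ref_var - scaled_var).max())
print(np.abs(ref_std - scaled_std).max())
```

If the reasoning in this issue holds, the first printed difference should be close to zero and the second should not.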
Top GitHub Comments
@chf42 if I go through the documentation here (https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC), gamma='scale' should be defined as 1/(X_train.shape[1]*np.array(X_train).var()), since X_train.shape is usually n_samples x n_features, as far as I can tell.
I'm really sorry to open this after 3 years, but I am trying to understand why gamma='scale' is calculated the way it is. Does anyone have any work explaining this?