problem in clf.coef_ for MultinomialNB
Description
With MultinomialNB, clf.coef_ behaves strangely: clf.coef_ is identical to clf.feature_log_prob_[1], and clf.intercept_ equals only one of the entries of clf.class_log_prior_.
For example:
clf.feature_log_prob_[0][0:3]
array([-3.63942161, -3.17296199, -4.59417863])
clf.feature_log_prob_[1][0:3]
array([-3.51935008, -3.010937 , -6.41836494])
clf.coef_[0][0:3]
array([-3.51935008, -3.010937 , -6.41836494])
Using your code example:
import numpy as np
X = np.random.randint(5, size=(6, 100))
#y = np.array([1, 2, 3, 4, 5, 6])
y = np.array([1, 2, 1, 1, 1, 1])
#y = np.array([1, 2, 3, 2, 1, 2])
from sklearn.naive_bayes import MultinomialNB
clf = MultinomialNB(alpha=12.0)
clf.fit(X, y)
#MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True)
print(clf.predict(X[2:3]))
clf.feature_count_
clf.feature_log_prob_
clf.class_log_prior_
clf.class_count_
clf.coef_
[1]
clf.coef_ - clf.feature_log_prob_[1]
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0.]])
clf.coef_
array([[-4.48863637, -4.55317489, -4.55317489, -4.62216776, -4.55317489,
-4.77631844, -4.62216776, -4.62216776, -4.62216776, -4.55317489,
-4.69627573, -4.48863637, -4.69627573, -4.69627573, -4.69627573,
-4.55317489, -4.77631844, -4.69627573, -4.69627573, -4.48863637,
-4.48863637, -4.55317489, -4.77631844, -4.48863637, -4.62216776,
-4.48863637, -4.62216776, -4.69627573, -4.48863637, -4.69627573,
-4.77631844, -4.55317489, -4.62216776, -4.62216776, -4.55317489,
-4.55317489, -4.48863637, -4.69627573, -4.77631844, -4.77631844,
-4.48863637, -4.62216776, -4.77631844, -4.55317489, -4.62216776,
-4.48863637, -4.69627573, -4.62216776, -4.55317489, -4.48863637,
-4.77631844, -4.48863637, -4.69627573, -4.55317489, -4.48863637,
-4.48863637, -4.69627573, -4.62216776, -4.62216776, -4.48863637,
-4.55317489, -4.55317489, -4.55317489, -4.48863637, -4.55317489,
-4.62216776, -4.62216776, -4.77631844, -4.55317489, -4.77631844,
-4.55317489, -4.69627573, -4.48863637, -4.55317489, -4.48863637,
-4.77631844, -4.77631844, -4.77631844, -4.62216776, -4.48863637,
-4.77631844, -4.55317489, -4.48863637, -4.69627573, -4.48863637,
-4.62216776, -4.62216776, -4.48863637, -4.55317489, -4.69627573,
-4.55317489, -4.77631844, -4.55317489, -4.62216776, -4.48863637,
-4.62216776, -4.55317489, -4.77631844, -4.48863637, -4.69627573]])
clf.intercept_
array([-1.79175947])
clf.class_log_prior_
array([-0.18232156, -1.79175947])
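The output above is consistent with coef_ and intercept_ being, for a two-class model, just the last row of feature_log_prob_ and the last entry of class_log_prior_. The sketch below reproduces that slicing with a hypothetical helper, binary_style_view (not part of scikit-learn), and checks that prediction still uses both classes. It relies only on the public attributes feature_log_prob_, class_log_prior_, and classes_; the fixed seed is an addition for repeatability.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

rng = np.random.RandomState(0)  # fixed seed, not in the original report
X = rng.randint(5, size=(6, 100))
y = np.array([1, 2, 1, 1, 1, 1])

clf = MultinomialNB(alpha=12.0).fit(X, y)


def binary_style_view(model):
    """Hypothetical helper: mimic the reported coef_/intercept_ values.

    With exactly two classes, keep only the row/entry of the second class;
    otherwise return the full per-class arrays.
    """
    if len(model.classes_) == 2:
        return model.feature_log_prob_[1:], model.class_log_prior_[1:]
    return model.feature_log_prob_, model.class_log_prior_


coef_like, intercept_like = binary_style_view(clf)
print(coef_like.shape)   # one row, although two classes were fitted
print(intercept_like)    # log prior of the second class only

# The prediction itself uses both classes: joint log-likelihood per class,
# then argmax over classes.
jll = X @ clf.feature_log_prob_.T + clf.class_log_prior_
manual_pred = clf.classes_[np.argmax(jll, axis=1)]
print(manual_pred, clf.predict(X))
```

With y = [1, 2, 1, 1, 1, 1] the class priors are 5/6 and 1/6, so the single intercept value log(1/6) ≈ -1.7918 is the one shown in the report.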
Issue Analytics
- State:
- Created 4 years ago
- Comments: 11 (5 by maintainers)
I don’t believe @Sandy4321 has opened a PR, so go for it @AntonPeniaziev.
Indeed, coef_ and intercept_ will be removed in the next release. I’m closing this issue. Thanks @shenoy-anurag for checking.
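For anyone who relied on coef_ and intercept_ to read a two-class MultinomialNB as a linear model, one possible replacement is to build the log-odds weights directly from feature_log_prob_ and class_log_prior_. This is a sketch, not an official migration path: the names w, b, and log_odds are illustrative, and the fixed seed is an assumption added for repeatability.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

rng = np.random.RandomState(0)  # fixed seed, not in the original report
X = rng.randint(5, size=(6, 100))
y = np.array([1, 2, 1, 1, 1, 1])

clf = MultinomialNB(alpha=12.0).fit(X, y)

# Log-odds of classes_[1] against classes_[0] as a linear function of X:
# the weights are the difference of the two feature log-probability rows,
# the bias is the difference of the two class log priors.
w = clf.feature_log_prob_[1] - clf.feature_log_prob_[0]
b = clf.class_log_prior_[1] - clf.class_log_prior_[0]
log_odds = X @ w + b

# Cross-check against the model's own normalized log probabilities;
# the normalizing constant cancels in the difference.
log_proba = clf.predict_log_proba(X)
reference = log_proba[:, 1] - log_proba[:, 0]
print(np.allclose(log_odds, reference))
```

A positive log_odds value means the sample is assigned to clf.classes_[1], otherwise to clf.classes_[0].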