
problem in clf.coef_ for MultinomialNB

See original GitHub issue

Description

With MultinomialNB, clf.coef_ behaves strangely: clf.coef_ is the same as clf.feature_log_prob_[1], and clf.intercept_ is the same as just one entry of clf.class_log_prior_.

For example:

clf.feature_log_prob_[0][0:3]

array([-3.63942161, -3.17296199, -4.59417863])

clf.feature_log_prob_[1][0:3]

array([-3.51935008, -3.010937 , -6.41836494])

clf.coef_[0][0:3]

array([-3.51935008, -3.010937 , -6.41836494])
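The three outputs above fit a simple rule: with exactly two classes, only the row for the second class is exposed, which gives coef_ the same shape as the coefficients of a binary linear model. The sketch below mimics that rule in plain Python. `legacy_coef` is a hypothetical helper written for illustration, not a scikit-learn function, and the rule itself is an assumption reconstructed from the behavior reported here.

```python
def legacy_coef(feature_log_prob):
    """Hypothetical helper mimicking the reported coef_ behavior.

    Assumption: with two classes only the rows from index 1 onward are
    returned; with more classes every row is returned.
    """
    if len(feature_log_prob) == 2:
        return feature_log_prob[1:]
    return feature_log_prob


# First three feature log probabilities per class, copied from the report.
feature_log_prob = [
    [-3.63942161, -3.17296199, -4.59417863],  # feature_log_prob_[0]
    [-3.51935008, -3.010937, -6.41836494],    # feature_log_prob_[1]
]

coef = legacy_coef(feature_log_prob)
print(coef)  # a single row: the values of feature_log_prob_[1]
```

Under this assumption the values for the first class are still available in feature_log_prob_[0]; they are simply not part of coef_.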

https://scikit-learn.org/stable/modules/generated/sklearn.naive_bayes.MultinomialNB.html#sklearn.naive_bayes.MultinomialNB

Using the code example from the documentation:

import numpy as np
from sklearn.naive_bayes import MultinomialNB

X = np.random.randint(5, size=(6, 100))
# y = np.array([1, 2, 3, 4, 5, 6])
y = np.array([1, 2, 1, 1, 1, 1])  # only two classes
# y = np.array([1, 2, 3, 2, 1, 2])

clf = MultinomialNB(alpha=12.0)
clf.fit(X, y)
# MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True)

print(clf.predict(X[2:3]))  # printed [1] in the reported run

# Fitted attributes inspected in the report
clf.feature_count_
clf.feature_log_prob_
clf.class_log_prior_
clf.class_count_
clf.coef_

clf.coef_ - clf.feature_log_prob_[1]

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
    0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
    0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
    0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
    0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
    0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
    0., 0., 0., 0.]])

clf.coef_

array([[-4.48863637, -4.55317489, -4.55317489, -4.62216776, -4.55317489,
    -4.77631844, -4.62216776, -4.62216776, -4.62216776, -4.55317489,
    -4.69627573, -4.48863637, -4.69627573, -4.69627573, -4.69627573,
    -4.55317489, -4.77631844, -4.69627573, -4.69627573, -4.48863637,
    -4.48863637, -4.55317489, -4.77631844, -4.48863637, -4.62216776,
    -4.48863637, -4.62216776, -4.69627573, -4.48863637, -4.69627573,
    -4.77631844, -4.55317489, -4.62216776, -4.62216776, -4.55317489,
    -4.55317489, -4.48863637, -4.69627573, -4.77631844, -4.77631844,
    -4.48863637, -4.62216776, -4.77631844, -4.55317489, -4.62216776,
    -4.48863637, -4.69627573, -4.62216776, -4.55317489, -4.48863637,
    -4.77631844, -4.48863637, -4.69627573, -4.55317489, -4.48863637,
    -4.48863637, -4.69627573, -4.62216776, -4.62216776, -4.48863637,
    -4.55317489, -4.55317489, -4.55317489, -4.48863637, -4.55317489,
    -4.62216776, -4.62216776, -4.77631844, -4.55317489, -4.77631844,
    -4.55317489, -4.69627573, -4.48863637, -4.55317489, -4.48863637,
    -4.77631844, -4.77631844, -4.77631844, -4.62216776, -4.48863637,
    -4.77631844, -4.55317489, -4.48863637, -4.69627573, -4.48863637,
    -4.62216776, -4.62216776, -4.48863637, -4.55317489, -4.69627573,
    -4.55317489, -4.77631844, -4.55317489, -4.62216776, -4.48863637,
    -4.62216776, -4.55317489, -4.77631844, -4.48863637, -4.69627573]])
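The coef_ row above contains only five distinct values. That is what the smoothed multinomial estimate log((N_ci + alpha) / (N_c + alpha * n_features)) produces when every feature count N_ci lies between 0 and 4, as it does for a single row drawn with np.random.randint(5, ...). The sketch below checks this in plain Python. alpha = 12 and n_features = 100 come from the reported code; the class total N_c = 224 is not stated in the report and is inferred here from the printed numbers, so treat it as an assumption.

```python
import math

alpha = 12.0        # smoothing value used in the report
n_features = 100    # X has 100 columns
class_total = 224   # assumption: inferred from the printed values, not reported

denominator = class_total + alpha * n_features  # 224 + 1200 = 1424

# Smoothed log probability for each possible feature count 0..4.
log_probs = {
    count: math.log((count + alpha) / denominator) for count in range(5)
}

for count, value in log_probs.items():
    print(count, round(value, 8))
```

Under that assumption the five results agree, to within rounding, with the five values printed above: -4.77631844 for a count of 0 up to -4.48863637 for a count of 4. This suggests coef_ here is simply the smoothed log probability row of the second class.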

clf.intercept_

array([-1.79175947])

clf.class_log_prior_

array([-0.18232156, -1.79175947])
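The two class_log_prior_ values are the logs of the empirical class frequencies in y = [1, 2, 1, 1, 1, 1] (five samples of class 1, one of class 2), and the reported intercept_ matches the second of them. A minimal check in plain Python, assuming priors are estimated from the training labels (the fit_prior=True setting shown in the repr comment above):

```python
import math
from collections import Counter

y = [1, 2, 1, 1, 1, 1]  # labels from the report

counts = Counter(y)
classes = sorted(counts)

# Log of the empirical frequency of each class, in sorted class order.
class_log_prior = [math.log(counts[c] / len(y)) for c in classes]

# Assumption: in the two-class case the reported intercept_ is just the
# entry for the second class.
intercept = class_log_prior[1:]

print(class_log_prior)
print(intercept)
```

The first list corresponds to log(5/6) and log(1/6), which round to the -0.18232156 and -1.79175947 shown above.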


Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 11 (5 by maintainers)

Top GitHub Comments

NicolasHug commented, Feb 19, 2020 (1 reaction)

I don’t believe @Sandy4321 has opened a PR so go for it @AntonPeniaziev

cmarmo commented, Feb 2, 2022 (0 reactions)

Indeed, coef_ and intercept_ will be removed in the next release. I’m closing this issue. Thanks @shenoy-anurag for checking.


Top Results From Across the Web

sklearn naive bayes MultinomialNB: Why do I get only one ...
The problem with MultinomialNB is that it is not a linear classifier and actually does not compute coefficients to determine a decision ...

sklearn.naive_bayes.MultinomialNB
The multinomial Naive Bayes classifier is suitable for classification with discrete features (e.g., word counts for text classification). The multinomial ...

Problems obtaining most informative features with scikit learn?
I'm trying to obtain the most informative features from a textual corpus. From this well ... the features with the highest coefficient ...

Notes on Multinomial Naive Bayes | Analytics with Python
This note presents a short derivation of the Multinomial Naive Bayes classifier, and shows how to interpret the coefficients and how to ...

Naive Bayes questions: continus data, negative data, and ...
As for trying this out in scikit-learn, when I pass the data to be clf.fit() by MultinomialNB , I get ValueError: Input X...
