Naive Bayes Classifier with Mixed Bernoulli/Gaussian Models

Description

I suggest letting the Naive Bayes classifier accept mixed datasets (some binary variables, some real-valued variables). Currently the GaussianNB and BernoulliNB classes each handle one case or the other, but not both combined. I’d be happy to write the code for this, so I’m curious whether this has been explored before and whether it would be helpful!

For example, on the Titanic dataset, gender is a Bernoulli variable while age is real-valued. Passing both into a single Naive Bayes classifier would let it use both features, which should improve its predictions.

This is related to this currently pending PR: https://github.com/scikit-learn/scikit-learn/pull/12569
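
One way to picture the proposal, as a minimal sketch rather than the implementation discussed in this issue: because Naive Bayes treats features as conditionally independent given the class, the per-class log-likelihoods from a `BernoulliNB` fitted on the binary columns and a `GaussianNB` fitted on the real-valued columns can be added together. Each model's posterior already includes the class prior, so the prior is subtracted once before renormalizing. The `MixedNB` class, its `binary_cols` and `real_cols` parameters, and the synthetic data below are all hypothetical and are not part of scikit-learn or the Titanic dataset.

```python
import numpy as np
from scipy.special import logsumexp
from sklearn.naive_bayes import BernoulliNB, GaussianNB


class MixedNB:
    """Hypothetical wrapper (not a scikit-learn class) combining two NB models."""

    def __init__(self, binary_cols, real_cols):
        self.binary_cols = list(binary_cols)
        self.real_cols = list(real_cols)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        # Fit each model only on the columns matching its distribution.
        self.bnb_ = BernoulliNB().fit(X[:, self.binary_cols], y)
        self.gnb_ = GaussianNB().fit(X[:, self.real_cols], y)
        self.classes_ = self.gnb_.classes_
        return self

    def predict_log_proba(self, X):
        X = np.asarray(X, dtype=float)
        # Both posteriors contain log P(y); subtract it once so it is
        # counted a single time in the combined score.
        joint = (
            self.bnb_.predict_log_proba(X[:, self.binary_cols])
            + self.gnb_.predict_log_proba(X[:, self.real_cols])
            - np.log(self.gnb_.class_prior_)
        )
        # Renormalize so each row is a valid log-probability vector.
        return joint - logsumexp(joint, axis=1, keepdims=True)

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_log_proba(X), axis=1)]


# Synthetic stand-in data (made up for illustration): one binary
# feature and one real-valued feature, both correlated with the label.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)
binary_feature = rng.binomial(1, np.where(y == 1, 0.8, 0.2))
real_feature = rng.normal(np.where(y == 1, 25.0, 40.0), 5.0)
X = np.column_stack([binary_feature, real_feature])

model = MixedNB(binary_cols=[0], real_cols=[1]).fit(X, y)
proba = np.exp(model.predict_log_proba(X))
accuracy = float(np.mean(model.predict(X) == y))
```

This sketch relies on both models estimating the same empirical class prior, which holds with their default settings; passing custom priors to either model would require adjusting the subtraction.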

Issue Analytics

  • State: open
  • Created 5 years ago
  • Reactions: 9
  • Comments: 8 (6 by maintainers)

Top GitHub Comments

2 reactions
gautam-e commented, Nov 11, 2021

I think a Mixed / General Naive Bayes classifier that lets one mix and match the already available sklearn implementations of Naive Bayes across categorical and continuous columns is a use case that has been left unattended. It has been asked about several times on Stack Overflow and elsewhere, mostly with pretty unsatisfactory answers. I have already derived and written some code that does this but have never made a PR before. Would be happy to team up on this @jarednielsen

1 reaction
FlorianWilhelm commented, May 31, 2019

@jarednielsen Are you working on this? As @jnothman already said, you could also start working on a GeneralNB even though the CategoricalNB is not merged yet. Let @timbicker and me know if you need any pointers or want to discuss something. We would like to support this endeavour.


Top Results From Across the Web

Naive-Bayes for mixed typed data in scikit-learn - Medium
Gaussian NB assumes your data to be independent and normally distributed, similarly Multinomial and categorical where the distribution is the ...

Is it possible to mix different variable types in Naive Bayes, for ...
The Gaussian Model. Typically, we use the Gaussian Naive Bayes model for variables on a continuous scale – assuming that our variables are...

Implementing 3 Naive Bayes classifiers in scikit-learn
Scikit-learn provide three naive Bayes classifiers: Bernoulli, multinomial and Gaussian. The only difference is about the probability ...

Fast Naive Bayes
This package is currently the only package that supports a Bernoulli distribution, a Multinomial distribution, and a Gaussian distribution, ...

Naive Bayes Classifier — How to Successfully Use It in Python?
The category of algorithms that Naive Bayes classifier belongs to ... Mixed NB (Gaussian + Categorical) approach 2 — train two separate models...
