# [FEATURE] Bayesian Kernel Density Classifier

I’ve been using this Bayesian kernel density classifier for a few years, and I thought I should move it out of my poorly organized project and into this one.

The prior $P(y)$ is estimated from the class frequencies in the training data. I primarily use it for spatial problems.
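For reference, a sketch of the Bayes rule that the classifier applies. The notation is an assumption added here, not taken from the issue: $\hat{p}(x \mid y = c)$ stands for the kernel density estimate fitted on the samples of class $c$, and $c$, $k$ range over the class labels.

$$
P(y = c \mid x) = \frac{\hat{p}(x \mid y = c)\, P(y = c)}{\sum_{k} \hat{p}(x \mid y = k)\, P(y = k)}
$$

The numerator is the joint density of $x$ and class $c$; dividing by the sum over all classes normalises the posteriors so they add up to 1.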

It is similar to the GMM Classifier, with only two differences I can think of:

- Hyperparameters are easier to decide on.
- Scaling is worse, I believe because the KDE part scales linearly with the sample size.
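To make the scaling point concrete, here is a minimal brute-force sketch of a Gaussian KDE in NumPy. The helper name `gaussian_kde_logpdf` is hypothetical, and this is not how scikit-learn's `KernelDensity` is implemented (it can use tree structures, so its real cost may differ). The sketch only shows that a naive evaluation compares every query point with every training point, so the work per query grows linearly with the training set size.

```python
import numpy as np


def gaussian_kde_logpdf(X_train, X_query, bandwidth=0.2):
    """Brute-force Gaussian KDE log-density (illustrative helper only)."""
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    n, d = X_train.shape
    # Squared distance from every query to every training point: shape (m, n).
    # This (m, n) matrix is where the linear cost in n per query comes from.
    sq_dist = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    log_kernel = -0.5 * sq_dist / bandwidth ** 2
    # Log normalising constant of a d-dimensional Gaussian with std = bandwidth
    log_norm = -0.5 * d * np.log(2 * np.pi) - d * np.log(bandwidth)
    # Numerically stable log of the mean kernel value over the n training points
    max_log = log_kernel.max(axis=1, keepdims=True)
    log_mean = max_log[:, 0] + np.log(np.exp(log_kernel - max_log).mean(axis=1))
    return log_norm + log_mean


# Made-up data: 1000 training points, 3 identical query points at the origin
X_train = np.random.default_rng(0).normal(size=(1000, 2))
X_query = np.zeros((3, 2))
log_density = gaussian_kde_logpdf(X_train, X_query)
```

Doubling the number of rows in `X_train` doubles the width of `sq_dist`, which is the linear growth mentioned in the list above.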

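```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.neighbors import KernelDensity


# noinspection PyPep8Naming
class BayesianKernelDensityClassifier(BaseEstimator, ClassifierMixin):
    """
    Bayesian Classifier that uses Kernel Density Estimations to generate the joint distribution
    Parameters:
    - bandwidth: float
    - kernel: for scikit learn KernelDensity
    """

    def __init__(self, bandwidth=0.2, kernel='gaussian'):
        # Only hyperparameters are set here; fitted attributes are created in fit()
        self.bandwidth = bandwidth
        self.kernel = kernel

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        # np.unique already returns the class labels in sorted order
        self.classes_ = np.unique(y)
        training_sets = [X[y == yi] for yi in self.classes_]
        # One kernel density estimate per class
        self.models_ = [KernelDensity(bandwidth=self.bandwidth, kernel=self.kernel).fit(x_subset)
                        for x_subset in training_sets]
        # Log prior of each class, estimated from the class frequencies
        self.priors_logp_ = [np.log(x_subset.shape[0] / X.shape[0]) for x_subset in training_sets]
        return self

    def predict_proba(self, X):
        # Column j holds log P(x | y = classes_[j]) + log P(y = classes_[j])
        logp = np.array([model.score_samples(X) for model in self.models_]).T
        logp = logp + self.priors_logp_
        # Subtract the row-wise maximum before exponentiating to avoid underflow to 0 / 0
        result = np.exp(logp - logp.max(axis=1, keepdims=True))
        return result / result.sum(1, keepdims=True)

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), 1)]
```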
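A quick way to sanity-check the approach is to unroll the same steps on toy data. The sketch below calls scikit-learn's `KernelDensity` directly so that it runs on its own. The two Gaussian blobs, the bandwidth of 0.5, and the query points are made up for illustration and are not from the issue.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
# Toy data (made up): two well-separated 2D Gaussian blobs
X = np.vstack([rng.normal(-2.0, 0.5, size=(100, 2)),
               rng.normal(2.0, 0.5, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)

classes = np.unique(y)
# One KDE per class, plus the log prior from the class frequencies
models = [KernelDensity(bandwidth=0.5, kernel="gaussian").fit(X[y == c]) for c in classes]
log_priors = np.log([np.mean(y == c) for c in classes])

# One query point near each blob centre
X_new = np.array([[-2.0, -2.0], [2.0, 2.0]])
# log P(x | y = c) + log P(y = c), one column per class
log_joint = np.column_stack([m.score_samples(X_new) for m in models]) + log_priors
# Shift by the row-wise maximum, exponentiate, and normalise to get posteriors
log_joint -= log_joint.max(axis=1, keepdims=True)
proba = np.exp(log_joint)
proba /= proba.sum(axis=1, keepdims=True)
pred = classes[proba.argmax(axis=1)]
```

Each row of `proba` sums to 1, and with blobs this far apart each query point should be assigned to the blob it sits in.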

I don’t know precisely how it scales in sklearn either, but I figured that the most expensive part of the expectation step is a matrix product, which would be $O(p^3)$ with the naive algorithm (and apparently about $O(p^{2.37})$ with the best known ones). It should also scale linearly with the number of clusters.

(Not an academic-level citation, I know, but see https://en.wikipedia.org/wiki/Computational_complexity_of_mathematical_operations#Matrix_algebra)

Whoops sorry.

https://colab.research.google.com/drive/12z28LCt2Y76smB01w2QizK3_cCo-6y2G

I never noticed how bad the scaling is on datasets larger than the academic ones, and that has actually been a big issue. I’ve used it on 2D species distribution data (I’m trying to see if I am allowed to use that dataset for the notebook).