Improve MinCovDet.fit error when covariance is zero

OK, this is extremely weird. Can someone run this code and see if it crashes with that error?

import numpy as np

from sklearn.covariance import MinCovDet

clf = MinCovDet()

data = np.array([0.5, 0.1, 0.1, 0.1, 0.957, 0.1, 0.1,
                 0.1, 0.4285, 0.1]).reshape(-1, 1)
clf.fit(data)

If I change the array to this:

data = np.array([0.5, 0.11, 0.1, 0.1, 0.957, 0.1, 0.1, 
                 0.1, 0.4285, 0.1]).reshape(-1, 1)

Then it runs fine.

But it seems to crash with any array that contains too many identical values.

This array crashes as well:

data = np.array([0.5, 0.3, 0.3, 0.3, 0.957, 0.3, 0.3, 
                 0.3, 0.4285, 0.3]).reshape(-1, 1)

I already checked for NaNs and everything; there’s nothing.

Using Python 3.6.2, Pandas 0.20.3, NumPy 1.13.1, scikit-learn 0.19.0.

Thanks

Issue Analytics

Created 6 years ago
Comments: 10 (7 by maintainers)

Top GitHub Comments

albertcthomas commented, Oct 10, 2017

In this case, the estimated covariance matrix of the support data is equal to 0, and therefore its determinant is equal to 0. We thus have the minimum covariance determinant, and the algorithm should stop, as explained in the original paper. The covariance is equal to 0 in your case because the support data are the points that share the same value. If there are fewer ties, the support data contain different values and everything works fine. If you don’t want the robust covariance to be estimated as 0, you may want to increase the support_fraction parameter to increase the number of support data.
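
The effect of ties on the support data can be illustrated with a small standard-library sketch. This is not scikit-learn's FastMCD implementation: tightest_subset_variance is a hypothetical helper that simply picks the n_support sorted neighbours with the smallest range, and the support sizes 6 and 9 are assumptions chosen for illustration (6 being roughly half of the 10 samples).

```python
from statistics import pvariance


def tightest_subset_variance(values, n_support):
    """Variance of the n_support sorted neighbours with the smallest range.

    Hypothetical helper for illustration only; it is a simplified stand-in
    for choosing the support data, not the FastMCD procedure itself.
    """
    xs = sorted(values)
    # Every run of n_support consecutive sorted values is a candidate subset.
    windows = [xs[i:i + n_support] for i in range(len(xs) - n_support + 1)]
    tightest = min(windows, key=lambda w: w[-1] - w[0])
    return pvariance(tightest)


# The first crashing array from the issue: seven of the ten values are 0.1.
data = [0.5, 0.1, 0.1, 0.1, 0.957, 0.1, 0.1, 0.1, 0.4285, 0.1]

var_small = tightest_subset_variance(data, 6)  # subset made only of ties
var_large = tightest_subset_variance(data, 9)  # subset must include other values

print(var_small)      # 0.0
print(var_large > 0)  # True
```

With a support of 6 points, the subset consists entirely of the tied 0.1 values, so its variance is exactly zero. With 9 points, the subset has to include distinct values and the variance becomes positive, which is the intuition behind raising support_fraction.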

lesteve commented, Oct 11, 2017

To be honest, just saying “det(cov) = 0, try to increase support_fraction” may be a good enough error message. PR more than welcome!
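
As a rough idea of what such a message could look like, here is a minimal sketch of a guard for the one-dimensional case. check_support_variance, its parameters, and the tolerance are all hypothetical names invented for this illustration; this is not the code that was merged into scikit-learn.

```python
def check_support_variance(variance, n_support, n_samples, tol=1e-12):
    """Hypothetical guard: fail early with an actionable message.

    Raises ValueError when only part of the data is used as support
    and that support has (numerically) zero variance.
    """
    if n_support < n_samples and abs(variance) <= tol:
        raise ValueError(
            "det(cov) = 0 for the support data, "
            "try to increase support_fraction"
        )
    return variance


try:
    # Assumed values: zero variance on a support of 6 out of 10 samples.
    check_support_variance(0.0, n_support=6, n_samples=10)
except ValueError as exc:
    message = str(exc)
    print(message)
```

The point of the sketch is only that the failure is reported at the place where the zero covariance is detected, with a hint about support_fraction, instead of surfacing later as an unrelated-looking error.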
