Improve MinCovDet.fit error when covariance is zero


OK, this is extremely weird. Can someone run this code and see whether it crashes with an error for them too?

import numpy as np

from sklearn.covariance import MinCovDet

clf = MinCovDet()

data = np.array([0.5, 0.1, 0.1, 0.1, 0.957, 0.1, 0.1,
                 0.1, 0.4285, 0.1]).reshape(-1, 1)
clf.fit(data)

If I change the array to this:

data = np.array([0.5, 0.11, 0.1, 0.1, 0.957, 0.1, 0.1, 
                 0.1, 0.4285, 0.1]).reshape(-1, 1)

Then it runs fine.

But it seems to crash with any array that contains too many identical values.

This array crashes as well:

data = np.array([0.5, 0.3, 0.3, 0.3, 0.957, 0.3, 0.3, 
                 0.3, 0.4285, 0.3]).reshape(-1, 1)

I already checked for NaNs and everything; there are none.

Using Python 3.6.2, pandas 0.20.3, NumPy 1.13.1, scikit-learn 0.19.0.

Thanks

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 10 (7 by maintainers)

Top GitHub Comments

albertcthomas commented, Oct 10, 2017 (1 reaction)

In this case, the estimated covariance matrix of the support data is equal to 0, and therefore its determinant is equal to 0. We have thus reached the minimum covariance determinant, and the algorithm should stop, as explained in the original paper. The covariance is 0 in your case because the support data all have the same value. With fewer ties, the support data contain different values and everything works fine. If you don’t want the robust covariance to be estimated as 0, you can increase the support_fraction parameter to increase the number of support data.
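To make the point about ties concrete, here is a rough, standard-library-only sketch. It is not scikit-learn's FastMCD procedure, only an illustration of the idea. The helper name `smallest_support_variance` is made up for this example, and the support size formulas, ceil(0.5 * (n_samples + n_features + 1)) by default and int(support_fraction * n_samples) otherwise, are assumptions about how the support size is chosen. Because the real algorithm selects the support differently in detail, this sketch should not be used to predict which borderline inputs fail.

```python
import math
from statistics import pvariance


def smallest_support_variance(values, support_fraction=None):
    """Return (h, smallest variance over any h consecutive sorted values).

    Hypothetical helper for illustration only.
    Assumption: for 1-D data (n_features = 1) the default support size is
    h = ceil(0.5 * (n_samples + n_features + 1)); with an explicit
    support_fraction it is int(support_fraction * n_samples).
    """
    x = sorted(values)
    n = len(x)
    if support_fraction is None:
        h = math.ceil(0.5 * (n + 1 + 1))
    else:
        h = int(support_fraction * n)
    h = max(1, min(h, n))  # keep the window size within [1, n]
    # In 1-D, the subset of h points with the smallest variance is a run of
    # h consecutive values in sorted order, so scanning windows is enough.
    variances = [pvariance(x[i:i + h]) for i in range(n - h + 1)]
    return h, min(variances)


# First array from the question: seven of the ten values are 0.1.
data = [0.5, 0.1, 0.1, 0.1, 0.957, 0.1, 0.1, 0.1, 0.4285, 0.1]

h_default, var_default = smallest_support_variance(data)
h_larger, var_larger = smallest_support_variance(data, support_fraction=0.8)

print(h_default, var_default)  # support made only of tied values
print(h_larger, var_larger)    # support forced to include other values
```

Under these assumptions the default support holds six points, all of which can be 0.1, so the variance of the support is 0. With support_fraction=0.8 the support holds eight points and has to include at least one value other than 0.1, so its variance is strictly positive.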

lesteve commented, Oct 11, 2017 (0 reactions)

To be honest, just saying “det(cov) = 0, try to increase support_fraction” may be a good enough error message. PR more than welcome!


