LinAlgError: the leading minor of order 11 of 'b' is not positive definite. The factorization of 'b' could not be completed and no eigenvalues or eigenvectors were computed.

I’m trying to use UMAP on whole-genome data from the 1000 Genomes Project. I’m not doing much preprocessing, just filtering out variants with minor allele frequency less than 10% and looking at a single chromosome (chr1); this gives me 509925 variants across 2504 individuals (a 509925x2504 matrix).

Here’s what I’m doing:

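import resource, sys
# Raise the recursion limit and stack size (workaround for the recursion errors described below)
sys.setrecursionlimit(10**6)
resource.setrlimit(resource.RLIMIT_STACK, (2**29,-1))
import umap
import allel
# Load the chr1 VCF and wrap the genotype calls: shape (n_variants, n_samples, ploidy)
callset = allel.read_vcf('ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz')
gt = allel.GenotypeArray(callset['calldata/GT'])
# Allele counts per variant: shape (n_variants, n_alleles)
allel_freqs = gt.count_alleles()
# Keep variants whose first alternate allele has frequency > 0.1
maf_1 = gt[((allel_freqs/float(gt.shape[1]*2))[:,1] > .1)]
reducer = umap.UMAP()
# Fit on the first allele of each call: a (n_variants, n_samples) matrix
embedding = reducer.fit(maf_1[:,:,0])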

(If any biologists are reading, I’m probably doing this filtering wrong but I think UMAP should still work in this case?)
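For readers unsure about the filtering step, here is a minimal sketch of a minor-allele-frequency filter, written with plain NumPy on a small synthetic genotype array rather than scikit-allel and the real VCF. It assumes biallelic variants coded 0/1, so the minor allele frequency is min(p, 1 - p), where p is the alternate allele frequency. The array sizes, the simulated frequencies, and all variable names are made up for illustration. The expression in the post thresholds the alternate allele frequency itself, so it would also keep variants whose alternate allele is very common.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 6 biallelic variants, 50 diploid individuals.
n_variants, n_samples = 6, 50
simulated_alt_freq = np.array([0.02, 0.05, 0.30, 0.50, 0.90, 0.97])

# Genotype array with shape (n_variants, n_samples, ploidy), alleles coded 0/1.
gt = (rng.random((n_variants, n_samples, 2))
      < simulated_alt_freq[:, None, None]).astype(np.int8)

# Alternate allele frequency per variant: alt allele count / total alleles.
alt_freq = gt.sum(axis=(1, 2)) / (2.0 * n_samples)

# Minor allele frequency is the frequency of the rarer allele (at most 0.5).
maf = np.minimum(alt_freq, 1.0 - alt_freq)

# Keep variants with MAF of at least 10%.
keep = maf >= 0.10
filtered = gt[keep]

# Collapse the two haplotypes into an alt-allele dosage (0, 1 or 2) and put
# individuals on the rows, giving shape (n_samples, n_kept_variants).
X = filtered.sum(axis=2).T

print("alt freq:", np.round(alt_freq, 2))
print("MAF:     ", np.round(maf, 2))
print("kept variants:", int(keep.sum()), "-> matrix shape", X.shape)
```

UMAP, like scikit-learn estimators, treats each row of its input as one sample. A matrix with variants on the rows therefore embeds variants, while the transposed matrix in the sketch would embed individuals. Which of the two is wanted here is not stated in the post.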

Initially I was running into RecursionLimit issues, but those went away after increasing the recursion limit (first 3 lines above). However, after a couple hours I get the following traceback:

In [42]: embedding = reducer.fit(maf_1[:,:,0])
  /home/bowser/.virtualenvs/genetics/local/lib/python2.7/site-packages/umap_learn-0.3.6-py2.7.egg/umap/spectral.py:229: UserWarning: Embedding a total of 2215 separate connected components using meta-embedding (experimental)
---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
<ipython-input-42-784bcd135a58> in <module>()
----> 1 embedding = reducer.fit(maf_1[:,:,0])

/home/bowser/.virtualenvs/genetics/local/lib/python2.7/site-packages/umap_learn-0.3.6-py2.7.egg/umap/umap_.pyc in fit(self, X, y)
   1534             self.metric,
   1535             self._metric_kwds,
-> 1536             self.verbose,
   1537         )
   1538

/home/bowser/.virtualenvs/genetics/local/lib/python2.7/site-packages/umap_learn-0.3.6-py2.7.egg/umap/umap_.pyc in simplicial_set_embedding(data, graph, n_components, initial_alpha, a, b, gamma, negative_sample_rate, n_epochs, init, random_state, metric, metric_kwds, verbose)
    939             random_state,
    940             metric=metric,
--> 941             metric_kwds=metric_kwds,
    942         )
    943         expansion = 10.0 / initialisation.max()

/home/bowser/.virtualenvs/genetics/local/lib/python2.7/site-packages/umap_learn-0.3.6-py2.7.egg/umap/spectral.pyc in spectral_layout(data, graph, dim, random_state, metric, metric_kwds)
    238             random_state,
    239             metric=metric,
--> 240             metric_kwds=metric_kwds,
    241         )
    242

/home/bowser/.virtualenvs/genetics/local/lib/python2.7/site-packages/umap_learn-0.3.6-py2.7.egg/umap/spectral.pyc in multi_component_layout(data, graph, n_components, component_labels, dim, random_state, metric, metric_kwds)
    120             dim,
    121             metric=metric,
--> 122             metric_kwds=metric_kwds,
    123         )
    124     else:

/home/bowser/.virtualenvs/genetics/local/lib/python2.7/site-packages/umap_learn-0.3.6-py2.7.egg/umap/spectral.pyc in component_layout(data, n_components, component_labels, dim, metric, metric_kwds)
     57     component_embedding = SpectralEmbedding(
     58         n_components=dim, affinity="precomputed"
---> 59     ).fit_transform(affinity_matrix)
     60     component_embedding /= component_embedding.max()
     61

/home/bowser/.virtualenvs/genetics/local/lib/python2.7/site-packages/sklearn/manifold/spectral_embedding_.pyc in fit_transform(self, X, y)
    546         X_new : array-like, shape (n_samples, n_components)
    547         """
--> 548         self.fit(X)
    549         return self.embedding_

/home/bowser/.virtualenvs/genetics/local/lib/python2.7/site-packages/sklearn/manifold/spectral_embedding_.pyc in fit(self, X, y)
    525                                              n_components=self.n_components,
    526                                              eigen_solver=self.eigen_solver,
--> 527                                              random_state=random_state)
    528         return self
    529

/home/bowser/.virtualenvs/genetics/local/lib/python2.7/site-packages/sklearn/manifold/spectral_embedding_.pyc in spectral_embedding(adjacency, n_components, eigen_solver, random_state, eigen_tol, norm_laplacian, drop_first)
    324             X[:, 0] = dd.ravel()
    325             lambdas, diffusion_map = lobpcg(laplacian, X, tol=1e-15,
--> 326                                             largest=False, maxiter=2000)
    327             embedding = diffusion_map.T[:n_components]
    328             if norm_laplacian:

/home/bowser/.virtualenvs/genetics/local/lib/python2.7/site-packages/scipy/sparse/linalg/eigen/lobpcg/lobpcg.pyc in lobpcg(A, X, B, M, Y, tol, maxiter, largest, verbosityLevel, retLambdaHistory, retResidualNormsHistory)
    488
    489         # Solve the generalized eigenvalue problem.
--> 490         _lambda, eigBlockVector = eigh(gramA, gramB, check_finite=False)
    491         ii = np.argsort(_lambda)[:sizeX]
    492         if largest:

/home/bowser/.virtualenvs/genetics/local/lib/python2.7/site-packages/scipy/linalg/decomp.pyc in eigh(a, b, lower, eigvals_only, overwrite_a, overwrite_b, turbo, eigvals, type, check_finite)
    491                           " factorization of 'b' could not be completed"
    492                           " and no eigenvalues or eigenvectors were"
--> 493                           " computed." % (info-b1.shape[0]))
    494
    495

LinAlgError: the leading minor of order 11 of 'b' is not positive definite. The factorization of 'b' could not be completed and no eigenvalues or eigenvectors were computed.

Any idea what’s going wrong here? If you want to reproduce, the VCF can be found here:

http://hgdownload.cse.ucsc.edu/gbdb/hg19/1000Genomes/phase3/ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz

The only other dependency aside from UMAP is scikit-allel.

Versions of various things:

In [46]: print umap.__version__    
0.3.6

In [47]: print allel.__version__   
1.1.10

In [48]: import sklearn

In [49]: print sklearn.__version__
0.20.0

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 7 (2 by maintainers)

Top GitHub Comments

lobpcg commented, Dec 10, 2018 (1 reaction)

@lmcinnes wrote: "Thanks for the suggestion @lobpcg; this is not my area of expertise, so do you have some suggested default values that should be good?"

The optimal tolerance is difficult to predict theoretically, since it is problem dependent. Practically speaking, just try raising tol above 1e-15 and/or lowering maxiter below 2000 and see what happens. If your code runs for too long with the default values, you will need to experiment on your own data to find the balance that keeps accuracy good enough while cutting the compute time.

I’ve seen examples where even maxiter=2 would be enough for the overall goal. Have fun!
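To make the tol/maxiter suggestion concrete, here is a minimal sketch that calls scipy.sparse.linalg.lobpcg directly on a toy graph Laplacian with a loose and a tighter setting, and reports the residual of each returned eigenpair. It is not the code path UMAP or scikit-learn uses; the traceback above only shows that scikit-learn called lobpcg with tol=1e-15 and maxiter=2000. The path-graph Laplacian, the helper names path_graph_laplacian and smallest_eigenpairs, and the specific tol/maxiter values are assumptions chosen for illustration.

```python
import warnings

import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import lobpcg


def path_graph_laplacian(n):
    """Sparse Laplacian of a path graph on n nodes (toy stand-in for a real graph)."""
    degree = np.full(n, 2.0)
    degree[[0, -1]] = 1.0  # the two end nodes have a single neighbour
    off_diagonal = -np.ones(n - 1)
    return diags([off_diagonal, degree, off_diagonal],
                 offsets=[-1, 0, 1], format="csr")


def smallest_eigenpairs(laplacian, k, tol, maxiter, seed=0):
    """Run lobpcg for the k smallest eigenpairs and return residual norms too."""
    rng = np.random.default_rng(seed)
    initial_guess = rng.standard_normal((laplacian.shape[0], k))
    with warnings.catch_warnings():
        # lobpcg may warn when it stops before reaching the requested tolerance.
        warnings.simplefilter("ignore")
        vals, vecs = lobpcg(laplacian, initial_guess,
                            tol=tol, maxiter=maxiter, largest=False)
    # Residual norm ||L v - lambda v|| for each returned pair.
    residuals = np.linalg.norm(laplacian @ vecs - vecs * vals, axis=0)
    return vals, vecs, residuals


L = path_graph_laplacian(200)

# Loose setting: few iterations, large tolerance.
loose_vals, loose_vecs, loose_res = smallest_eigenpairs(L, k=3, tol=1e-3, maxiter=20)
# Tighter setting: more iterations, smaller tolerance.
tight_vals, tight_vecs, tight_res = smallest_eigenpairs(L, k=3, tol=1e-6, maxiter=200)

print("loose eigenvalues:", loose_vals, "residuals:", loose_res)
print("tight eigenvalues:", tight_vals, "residuals:", tight_res)
```

Comparing the residuals against the run time for a few settings on your own matrix is one way to do the experimentation described above. How much accuracy the downstream embedding actually needs is problem dependent.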

lmcinnes commented, Dec 10, 2018 (0 reactions)

Thanks for the suggestion @lobpcg; this is not my area of expertise, so do you have some suggested default values that should be good?
