
SVD inconsistency for sparse matrix with LOBPCG

See original GitHub issue


Computing singular values with the LOBPCG solver is not stable with respect to the number of singular values requested: asking for a different number of singular values yields different results for the values the two computations have in common.

To be more precise, take the sparse matrix:

import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import svds

mat_size = 128
mat_diag = np.full(mat_size, -2)
mat_hors_diag = np.full(mat_size - 1, 1)
mat = diags((mat_hors_diag, mat_diag, mat_hors_diag), (-1, 0, 1))

Then I compute the 16 largest-magnitude (LM) and smallest-magnitude (SM) singular values of this matrix with either the arpack or the lobpcg solver. I do this twice, first asking for 16 and then for 32 LM/SM singular values, and compare the 16 LM/SM values common to the two computations. The difference between the two should be close to machine precision.

First, as a reference, the 16 LM singular values with arpack:

>>> svd_16 = svds(mat, k=16, which="LM", return_singular_vectors=False, solver="arpack")
>>> svd_32 = svds(mat, k=32, which="LM", return_singular_vectors=False, solver="arpack")
>>> print(np.abs(svd_16-svd_32[-16:]).max())
3.552713678800501e-15

The maximum difference is close to machine precision, which is good.

Then the 16 LM singular values with lobpcg:

>>> svd_16 = svds(mat, k=16, which="LM", return_singular_vectors=False, solver="lobpcg")
>>> svd_32 = svds(mat, k=32, which="LM", return_singular_vectors=False, solver="lobpcg")
>>> print(np.abs(svd_16-svd_32[-16:]).max())
1.1664855303905597e-08

We have lost about half of the significant digits.
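
For reference, this symmetric tridiagonal Toeplitz matrix has known eigenvalues -2 + 2*cos(k*pi/(n+1)), k = 1..n, so its singular values are their absolute values. One can therefore also compare against the exact values; below is a minimal sketch of such a check (it assumes svds returns the singular values in ascending order, as the slicing above does):

import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import svds

n = 128
mat = diags((np.full(n - 1, 1.0), np.full(n, -2.0), np.full(n - 1, 1.0)), (-1, 0, 1))

# Exact singular values of the tridiagonal Toeplitz matrix, sorted ascending.
k = np.arange(1, n + 1)
exact = np.sort(np.abs(-2.0 + 2.0 * np.cos(k * np.pi / (n + 1))))

for solver in ("arpack", "lobpcg"):
    approx = svds(mat, k=16, which="LM", return_singular_vectors=False, solver=solver)
    print(solver, np.abs(np.sort(approx) - exact[-16:]).max())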

And the SM version with lobpcg:

>>> svd_16 = svds(mat, k=16, which="SM", return_singular_vectors=False, solver="lobpcg")
>>> svd_32 = svds(mat, k=32, which="SM", return_singular_vectors=False, solver="lobpcg")
>>> print(np.abs(svd_16-svd_32[:16]).max())
0.03882329852828505

It is even worse. I did not include the SM result with arpack because it does not converge in that case.
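
For completeness, the non-convergence can be checked along these lines (a sketch; it assumes that svds with solver="arpack" propagates ArpackNoConvergence once maxiter is exhausted, which is my reading of its behavior rather than something stated in the report, and the maxiter value is arbitrary):

import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import svds, ArpackNoConvergence

n = 128
mat = diags((np.full(n - 1, 1.0), np.full(n, -2.0), np.full(n - 1, 1.0)), (-1, 0, 1))

try:
    # Cap the iteration count so the failure shows up quickly.
    sm = svds(mat, k=16, which="SM", return_singular_vectors=False,
              solver="arpack", maxiter=200)
    print("converged:", sm)
except ArpackNoConvergence as err:
    print("ARPACK did not converge:", err)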

To see how this behaves as the matrix size varies, I plotted these maximum differences against the matrix size (see svd_scipy.pdf). The LM arpack curve stays close to machine precision, which is good. The LM and SM LOBPCG curves also look fine for small matrix sizes, but around a matrix size of 80 they both jump, and the jump is much larger for the SM version.


Additional information:

  • This behavior does not seem to depend on the values in the matrix, nor on whether they are real or complex.
  • The matrix size at which the jump occurs depends on the number of singular values requested (here 16/32); a parameterized sketch for exploring this is included after the plot code below.
  • SciPy/NumPy/Python version information
    >>> import sys, scipy, numpy; print(scipy.__version__, numpy.__version__, sys.version_info)
    1.7.3 1.21.4 sys.version_info(major=3, minor=9, micro=5, releaselevel='final', serial=0)
    

Code for the plot:

import matplotlib.pyplot as plt
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import svds


def matrix(mat_size):
    mat_diag = np.full(mat_size, -2)
    mat_hors_diag = np.full(mat_size - 1, 1)
    return diags((mat_hors_diag, mat_diag, mat_hors_diag), (-1, 0, 1))


def LM_Arpack(mat_size):
    mat = matrix(mat_size)
    svd_16 = svds(mat, k=16, which="LM", return_singular_vectors=False, solver="arpack")
    svd_32 = svds(mat, k=32, which="LM", return_singular_vectors=False, solver="arpack")
    return np.abs(svd_16 - svd_32[-16:]).max()


def LM_LOBPCG(mat_size):
    mat = matrix(mat_size)
    svd_16 = svds(mat, k=16, which="LM", return_singular_vectors=False, solver="lobpcg")
    svd_32 = svds(mat, k=32, which="LM", return_singular_vectors=False, solver="lobpcg")
    return np.abs(svd_16 - svd_32[-16:]).max()


def SM_LOBPCG(mat_size):
    mat = matrix(mat_size)
    svd_16 = svds(mat, k=16, which="SM", return_singular_vectors=False, solver="lobpcg")
    svd_32 = svds(mat, k=32, which="SM", return_singular_vectors=False, solver="lobpcg")
    return np.abs(svd_16 - svd_32[:16]).max()


N = np.arange(33, 129)
lm_arpack = [LM_Arpack(mat_size) for mat_size in N]
lm_lobpcg = [LM_LOBPCG(mat_size) for mat_size in N]
sm_lobpcg = [SM_LOBPCG(mat_size) for mat_size in N]

fig, ax = plt.subplots(constrained_layout=True)

ax.semilogy(N, lm_arpack, ".--", label="LM Arpack")
ax.semilogy(N, lm_lobpcg, "+--", label="LM LOBPCG")
ax.semilogy(N, sm_lobpcg, "x--", label="SM LOBPCG")

ax.grid(True)
ax.set_xlabel("matrix size", fontsize=15)
ax.set_ylabel("difference", fontsize=15)
ax.legend(loc=2, fontsize=15)

plt.show()
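
To explore the second bullet point above (the jump location depending on the number of singular values requested), a parameterized variant of the helpers in the plot script could look like the sketch below; the values of k, the 1e-10 threshold and the scan range are illustrative choices, not taken from the original report:

import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import svds


def matrix(mat_size):
    # Same tridiagonal matrix as above: -2 on the diagonal, 1 off-diagonal.
    mat_diag = np.full(mat_size, -2)
    mat_hors_diag = np.full(mat_size - 1, 1)
    return diags((mat_hors_diag, mat_diag, mat_hors_diag), (-1, 0, 1))


def lobpcg_diff(mat_size, k, which):
    # Maximum difference between the k requested singular values and the same
    # values taken from a run requesting 2*k, both with the LOBPCG solver.
    svd_k = svds(matrix(mat_size), k=k, which=which,
                 return_singular_vectors=False, solver="lobpcg")
    svd_2k = svds(matrix(mat_size), k=2 * k, which=which,
                  return_singular_vectors=False, solver="lobpcg")
    ref = svd_2k[-k:] if which == "LM" else svd_2k[:k]
    return np.abs(svd_k - ref).max()


# Rough scan: report the first matrix size at which the LM difference
# exceeds the (arbitrary) threshold of 1e-10.
for k in (8, 16):
    sizes = range(2 * k + 1, 129)
    jump = next((n for n in sizes if lobpcg_diff(n, k, "LM") > 1e-10), None)
    print(f"k={k}: first size with difference > 1e-10: {jump}")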

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

1 reaction
zmoitier commented, Apr 2, 2022

Yes, it does. Thanks for letting me know.

0 reactions
lobpcg commented, Apr 2, 2022

@zmoitier It appears that the issue has been resolved and could be closed?


