
SVD inconsistency for sparse matrix with LOBPCG

See original GitHub issue


Computing singular values with the LOBPCG solver is not stable with respect to the number of singular values requested: asking for a different number of singular values yields different results for the values the two computations have in common.

To be more precise, take the sparse matrix:

import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import svds

mat_size = 128
mat_diag = np.full(mat_size, -2)
mat_hors_diag = np.full(mat_size - 1, 1)
mat = diags((mat_hors_diag, mat_diag, mat_hors_diag), (-1, 0, 1))

Then I compute the 16 largest-magnitude (LM) and smallest-magnitude (SM) singular values of this matrix with either the arpack or the lobpcg solver. I do this twice, first asking for 16 and then for 32 LM/SM singular values, and compare the 16 LM/SM values common to the two computations. The difference between the two should be close to machine precision.

First, as a reference, the 16 LM singular values with arpack:

>>> svd_16 = svds(mat, k=16, which="LM", return_singular_vectors=False, solver="arpack")
>>> svd_32 = svds(mat, k=32, which="LM", return_singular_vectors=False, solver="arpack")
>>> print(np.abs(svd_16-svd_32[-16:]).max())
3.552713678800501e-15

The maximum difference is close to machine precision, which is good.

Then the 16 LM singular values with lobpcg:

>>> svd_16 = svds(mat, k=16, which="LM", return_singular_vectors=False, solver="lobpcg")
>>> svd_32 = svds(mat, k=32, which="LM", return_singular_vectors=False, solver="lobpcg")
>>> print(np.abs(svd_16-svd_32[-16:]).max())
1.1664855303905597e-08

We have lost about half of the significant digits.
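
For reference, this symmetric tridiagonal Toeplitz matrix has known eigenvalues -2 + 2*cos(k*pi/(n+1)), k = 1..n, so its singular values are their absolute values. One can therefore also compare against the exact values; below is a minimal sketch of such a check (it assumes svds returns the singular values in ascending order, as the slicing above does):

import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import svds

n = 128
mat = diags((np.full(n - 1, 1.0), np.full(n, -2.0), np.full(n - 1, 1.0)), (-1, 0, 1))

# Exact singular values of the tridiagonal Toeplitz matrix, sorted ascending.
k = np.arange(1, n + 1)
exact = np.sort(np.abs(-2.0 + 2.0 * np.cos(k * np.pi / (n + 1))))

for solver in ("arpack", "lobpcg"):
    approx = svds(mat, k=16, which="LM", return_singular_vectors=False, solver=solver)
    print(solver, np.abs(np.sort(approx) - exact[-16:]).max())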

And the SM version with lobpcg:

>>> svd_16 = svds(mat, k=16, which="SM", return_singular_vectors=False, solver="lobpcg")
>>> svd_32 = svds(mat, k=32, which="SM", return_singular_vectors=False, solver="lobpcg")
>>> print(np.abs(svd_16-svd_32[:16]).max())
0.03882329852828505

It is even worse. I did not include the SM result with arpack because it does not converge in that case.
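
For completeness, the non-convergence can be checked along these lines (a sketch; it assumes that svds with solver="arpack" propagates ArpackNoConvergence once maxiter is exhausted, which is my reading of its behavior rather than something stated in the report, and the maxiter value is arbitrary):

import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import svds, ArpackNoConvergence

n = 128
mat = diags((np.full(n - 1, 1.0), np.full(n, -2.0), np.full(n - 1, 1.0)), (-1, 0, 1))

try:
    # Cap the iteration count so the failure shows up quickly.
    sm = svds(mat, k=16, which="SM", return_singular_vectors=False,
              solver="arpack", maxiter=200)
    print("converged:", sm)
except ArpackNoConvergence as err:
    print("ARPACK did not converge:", err)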

To see how this behaves as the matrix size varies, I plotted these maximum differences against the matrix size (see svd_scipy.pdf). The LM arpack curve stays close to machine precision, which is good. The LM and SM LOBPCG curves also look fine for small matrix sizes, but around a matrix size of 80 they both jump, and the jump is much larger for the SM version.


Additional information:

  • This behavior does not seem to depend on the values in the matrix, nor on whether they are real or complex.
  • The matrix size at which the jump occurs depends on the number of singular values requested (here 16/32); a parameterized sketch for exploring this is included after the plot code below.
  • SciPy/NumPy/Python version information
    >>> import sys, scipy, numpy; print(scipy.__version__, numpy.__version__, sys.version_info)
    1.7.3 1.21.4 sys.version_info(major=3, minor=9, micro=5, releaselevel='final', serial=0)
    

Code for the plot:

import matplotlib.pyplot as plt
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import svds


def matrix(mat_size):
    mat_diag = np.full(mat_size, -2)
    mat_hors_diag = np.full(mat_size - 1, 1)
    return diags((mat_hors_diag, mat_diag, mat_hors_diag), (-1, 0, 1))


def LM_Arpack(mat_size):
    mat = matrix(mat_size)
    svd_16 = svds(mat, k=16, which="LM", return_singular_vectors=False, solver="arpack")
    svd_32 = svds(mat, k=32, which="LM", return_singular_vectors=False, solver="arpack")
    return np.abs(svd_16 - svd_32[-16:]).max()


def LM_LOBPCG(mat_size):
    mat = matrix(mat_size)
    svd_16 = svds(mat, k=16, which="LM", return_singular_vectors=False, solver="lobpcg")
    svd_32 = svds(mat, k=32, which="LM", return_singular_vectors=False, solver="lobpcg")
    return np.abs(svd_16 - svd_32[-16:]).max()


def SM_LOBPCG(mat_size):
    mat = matrix(mat_size)
    svd_16 = svds(mat, k=16, which="SM", return_singular_vectors=False, solver="lobpcg")
    svd_32 = svds(mat, k=32, which="SM", return_singular_vectors=False, solver="lobpcg")
    return np.abs(svd_16 - svd_32[:16]).max()


N = np.arange(33, 129)
lm_arpack = [LM_Arpack(mat_size) for mat_size in N]
lm_lobpcg = [LM_LOBPCG(mat_size) for mat_size in N]
sm_lobpcg = [SM_LOBPCG(mat_size) for mat_size in N]

fig, ax = plt.subplots(constrained_layout=True)

ax.semilogy(N, lm_arpack, ".--", label="LM Arpack")
ax.semilogy(N, lm_lobpcg, "+--", label="LM LOBPCG")
ax.semilogy(N, sm_lobpcg, "x--", label="SM LOBPCG")

ax.grid(True)
ax.set_xlabel("matrix size", fontsize=15)
ax.set_ylabel("difference", fontsize=15)
ax.legend(loc=2, fontsize=15)

plt.show()
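
To explore the second bullet point above (the jump location depending on the number of singular values requested), a parameterized variant of the helpers in the plot script could look like the sketch below; the values of k, the 1e-10 threshold and the scan range are illustrative choices, not taken from the original report:

import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import svds


def matrix(mat_size):
    # Same tridiagonal matrix as above: -2 on the diagonal, 1 off-diagonal.
    mat_diag = np.full(mat_size, -2)
    mat_hors_diag = np.full(mat_size - 1, 1)
    return diags((mat_hors_diag, mat_diag, mat_hors_diag), (-1, 0, 1))


def lobpcg_diff(mat_size, k, which):
    # Maximum difference between the k requested singular values and the same
    # values taken from a run requesting 2*k, both with the LOBPCG solver.
    svd_k = svds(matrix(mat_size), k=k, which=which,
                 return_singular_vectors=False, solver="lobpcg")
    svd_2k = svds(matrix(mat_size), k=2 * k, which=which,
                  return_singular_vectors=False, solver="lobpcg")
    ref = svd_2k[-k:] if which == "LM" else svd_2k[:k]
    return np.abs(svd_k - ref).max()


# Rough scan: report the first matrix size at which the LM difference
# exceeds the (arbitrary) threshold of 1e-10.
for k in (8, 16):
    sizes = range(2 * k + 1, 129)
    jump = next((n for n in sizes if lobpcg_diff(n, k, "LM") > 1e-10), None)
    print(f"k={k}: first size with difference > 1e-10: {jump}")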

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 8 (4 by maintainers)

Top GitHub Comments

1 reaction
zmoitier commented, Apr 2, 2022

Yes, it does. Thanks for letting me know.

0 reactions
lobpcg commented, Apr 2, 2022

@zmoitier It appears that the issue has been resolved and could be closed?


