sparse SVD cutoff

Hello,

scipy.sparse.linalg.svds introduces a cutoff in singular values: values smaller than eps*f*largest_singular_value are replaced by zero. Here eps is the machine epsilon of the input datatype and f is 1e3 for single precision and 1e6 for double precision, hence cond = eps*f = 2.220446049250313e-10 for double and cond = 0.00011920928955078125 for single precision. This is done at https://github.com/scipy/scipy/blob/v1.5.2/scipy/sparse/linalg/eigen/arpack/arpack.py#L1875-L1879.
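
For illustration, here is a minimal sketch of the thresholding as described in this issue. It follows the description above rather than reproducing the SciPy source, and the names `svds_cond` and `apply_cutoff` are hypothetical helpers, not part of SciPy.

```python
import numpy as np

# Factors quoted in the issue: 1e3 for single, 1e6 for double precision.
FACTOR = {"f": 1e3, "d": 1e6}


def svds_cond(dtype):
    """Relative cutoff cond = f * eps for a real or complex dtype (hypothetical helper)."""
    t = np.dtype(dtype).char.lower()  # complex 'F'/'D' map to 'f'/'d'
    return FACTOR[t] * float(np.finfo(t).eps)


def apply_cutoff(s, dtype):
    """Replace values below cond * largest value by zero, as described in the issue."""
    s = np.asarray(s, dtype=float)
    return np.where(s < svds_cond(dtype) * s.max(), 0.0, s)


print(svds_cond(np.float64), svds_cond(np.float32))
print(apply_cutoff([1.0, 1e-5, 1e-11], np.float64))
print(apply_cutoff([1.0, 1e-5, 1e-11], np.float32))
```

With these factors, a singular value of 1e-5 relative to the largest one survives in double precision but is zeroed in single precision, which is why the hard-coded values matter in practice.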

The first problem is that this feature is not documented. The second is that these values are not that small and are hard-coded, so it is not possible to change them. They appear to come from scipy.linalg.pinvh; however, there the cutoff is documented and can be specified. The API does have a tol argument, which is used when computing the eigenvalues of A.H @ A, but it has no effect on the cutoff.

I think tol could be used as the cutoff for the singular values in addition to being passed to eigsh; alternatively, a cond keyword argument could be added.

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 25 (14 by maintainers)

Top GitHub Comments

1 reaction
evgueni-ovtchinnikov commented, May 29, 2022

@mgoldeli @lobpcg the time increase is substantial because the matrix Data in Issue_svds.zip is too small - just 4000 by 24.

With such a small size, there is hardly any point in using svds rather than the much more robust and accurate svd: on my laptop, 1000 svds calls took 2.14 sec and 1000 svd calls took 2.7 sec.

Actually, svds becomes much more efficient than svd if min(data.shape) >> k, in which case the extra cost of the new more reliable svds becomes negligible.
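
A rough way to reproduce this kind of comparison is sketched below. The data are random and only match the 4000 by 24 shape mentioned above (the contents of Issue_svds.zip are not used), k=3 is an arbitrary choice, and timings will vary by machine and SciPy version.

```python
import time

import numpy as np
from scipy.linalg import svd
from scipy.sparse.linalg import svds

rng = np.random.default_rng(0)
data = rng.standard_normal((4000, 24))  # assumed stand-in for the small matrix
k = 3  # arbitrary number of singular values to request

# Truncated SVD: only the k largest singular values.
t0 = time.perf_counter()
s_sparse = svds(data, k=k, return_singular_vectors=False)
t_svds = time.perf_counter() - t0

# Full dense SVD: all min(data.shape) singular values.
t0 = time.perf_counter()
s_dense = svd(data, compute_uv=False)
t_svd = time.perf_counter() - t0

print(f"svds: {t_svds:.4f} s, svd: {t_svd:.4f} s")
```

Increasing both dimensions while keeping k fixed is the regime where svds is expected to pull ahead.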

0 reactions
ogauthe commented, May 30, 2022

For what it’s worth, I have been running a similar post-processing for some time for reasonably large dense matrices. A typical run computes k=100 eigenvectors out of ncv=300 generated Lanczos vectors of a (20k, 20k) matrix with eigsh, then calls svd(A @ eigvec).

Time spent in scipy.linalg.svd is less than 0.2% of the time spent in eigsh, so in such a case the additional cost is negligible.
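
The post-processing described above could look roughly like the following sketch. It assumes eigsh is applied to the Gram matrix A.H @ A, which the comment does not state explicitly, and `svds_via_eigsh` is a hypothetical name rather than a SciPy function.

```python
import numpy as np
from scipy.linalg import svd
from scipy.sparse.linalg import eigsh


def svds_via_eigsh(A, k, ncv=None):
    """Truncated SVD from eigsh plus a small dense SVD (hypothetical helper)."""
    # Assumption: eigenvectors are taken from the Gram matrix A.H @ A.
    gram = A.conj().T @ A
    _, eigvec = eigsh(gram, k=k, which="LM", ncv=ncv)

    # A @ eigvec has only k columns, so this dense SVD is cheap.
    u, s, wh = svd(A @ eigvec, full_matrices=False)

    # A @ eigvec = u @ diag(s) @ wh, hence A ~ u @ diag(s) @ (wh @ eigvec.H).
    vh = wh @ eigvec.conj().T
    return u, s, vh


rng = np.random.default_rng(0)
A = rng.standard_normal((200, 30))
u, s, vh = svds_via_eigsh(A, k=5)
print(s)
```

Because the singular values come from a dense SVD of A @ eigvec rather than from square roots of eigenvalues, no eigenvalue cutoff is involved in this variant.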
