sparse SVD cutoff
Hello,
scipy.sparse.linalg.svds introduces a cutoff on the singular values. Values smaller than eps*f*largest_singular_value are replaced by zero, where eps is the machine epsilon of the input's datatype and f is 1e3 for single precision and 1e6 for double precision (hence cond = 2.220446049250313e-10 for double and cond = 0.00011920928955078125 for single precision). This is done at https://github.com/scipy/scipy/blob/v1.5.2/scipy/sparse/linalg/eigen/arpack/arpack.py#L1875-L1879.
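For illustration, the filtering behaves roughly like the following sketch (svds_cutoff is a hypothetical re-implementation of the linked arpack.py logic, not an actual scipy function):

```python
import numpy as np

def svds_cutoff(s, dtype):
    """Sketch of the hard-coded filtering in the linked arpack.py code:
    singular values at or below eps * f * s_max are replaced by zero."""
    eps = np.finfo(dtype).eps
    f = 1e3 if dtype == np.float32 else 1e6  # single vs. double precision
    cutoff = eps * f * np.max(s)
    return np.where(s > cutoff, s, 0.0)

s = np.array([1.0, 1e-5, 1e-12])
print(svds_cutoff(s, np.float64))  # the 1e-12 value falls below ~2.2e-10
```

Note that a singular value of 1e-12 is by no means negligible in double precision, yet it is zeroed out by this threshold.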
The first problem is that this feature is not documented. The second is that these values are not that small and are hard-coded: it is not possible to change them. They appear to come from scipy.linalg.pinvh, but there the cutoff is documented and can be specified. There is a tol argument in the API that is used when computing the eigenvalues of A.H @ A, but it has no effect on the cutoff.
I think tol could be used as the cutoff for the singular values in addition to its role as a parameter for eigsh; alternatively, a cond keyword argument could be added.
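As a workaround until such a knob exists, one can reproduce the eigsh-based computation by hand and apply a threshold of one's own choosing. This is a sketch, not an endorsed scipy recipe; the matrix, sizes, and my_cutoff value are illustrative assumptions:

```python
import numpy as np
from scipy.sparse import random as sprandom
from scipy.sparse.linalg import LinearOperator, eigsh

# Illustrative sparse matrix; in practice this would be the user's data.
A = sprandom(200, 50, density=0.2, random_state=0, format="csr")
k = 5

# Eigenvalues of A.T @ A are the squared singular values of A.
AtA = LinearOperator((A.shape[1], A.shape[1]),
                     matvec=lambda x: A.T @ (A @ x), dtype=np.float64)
w, v = eigsh(AtA, k=k, which="LM")
s = np.sqrt(np.maximum(w, 0))           # singular values (ascending order)

my_cutoff = 0.0                         # user-chosen, instead of eps * 1e6 * s.max()
keep = s > my_cutoff
u = A @ v[:, keep] / s[keep]            # corresponding left singular vectors

# Sanity check against the dense SVD
s_dense = np.linalg.svd(A.toarray(), compute_uv=False)[:k]
print(np.allclose(np.sort(s)[::-1], s_dense))
```

With my_cutoff = 0.0, nothing is discarded; the point is that the threshold is under the caller's control rather than hard-coded.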
Issue Analytics
- State:
- Created 3 years ago
- Comments: 25 (14 by maintainers)

@mgoldeli @lobpcg the time increase is substantial because the matrix (data in Issue_svds.zip) is too small - just 4000 by 24. With such a small size, there is hardly any point in using svds rather than the much more robust and accurate svd - on my laptop 1000 svds calls took 2.14 sec and 1000 svd calls 2.7 sec.

Actually, svds becomes much more efficient than svd if min(data.shape) >> k, in which case the extra cost of the new, more reliable svds becomes negligible.

For what it's worth, I have been running a similar post-processing for some time on reasonably large dense matrices. A typical run computes k=100 eigenvectors out of ncv=300 generated Lanczos vectors of a (20k, 20k) matrix with eigsh, then calls svd(A @ eigvec). Time spent in scipy.linalg.svd is less than 0.2% of the time spent in eigsh, so in such a case the additional cost is negligible.