`SuperLU.solve` leaks memory for `trans="T"`
Description
I’m using the sparse linear solver to repeatedly solve a system whose LHS matrix changes on every iteration (and I need solves with both the matrix and its transpose). Doing this exhausts my CUDA memory fairly quickly. If I refactor the code and change `splu(A).solve(b, trans="T")` into `splu(A.T).solve(b)`, memory usage stays constant for the whole run.
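The two call patterns should be interchangeable because both solve A^T x = b. As a minimal sanity check, this can be verified on CPU with SciPy, whose `splu`/`SuperLU.solve` API CuPy mirrors; the matrix and vector below are made-up example data, not from the report:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as sla

rng = np.random.default_rng(0)
n = 50
# random sparse matrix; the diagonal shift keeps it nonsingular
A = sp.random(n, n, density=0.1, random_state=rng, format="csc") + 10 * sp.eye(n, format="csc")
b = rng.standard_normal(n)

x1 = sla.splu(A).solve(b, trans="T")   # solve A^T x = b via the trans flag
x2 = sla.splu(A.T.tocsc()).solve(b)    # solve by factorizing A^T directly

assert np.allclose(x1, x2)
```

So the refactor changes only which code path performs the transposed solve, not the result, which is what makes it usable as a workaround.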
To Reproduce
```python
import cupy as cp
from cupyx.scipy.sparse import linalg

A: cp.sparse.coo_matrix = ...  # some big sparse matrix
b = cp.zeros(A.shape[0])

for i in range(2000):
    lu = linalg.splu(A)
    x = lu.solve(b, trans="T")  # leaks GPU memory on every iteration
```
Observe the memory usage with `nvidia-smi -l`; it grows until the loop eventually fails with `CuSparseError: CUSPARSE_STATUS_ALLOC_FAILED`.
On the other hand, this works fine:
```python
import cupy as cp
from cupyx.scipy.sparse import linalg

A: cp.sparse.coo_matrix = ...  # some big sparse matrix
b = cp.zeros(A.shape[0])

for i in range(2000):
    lu = linalg.splu(A.T)
    x = lu.solve(b)  # memory usage stays constant
```
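One side note on the repro (an observation about SciPy's CPU `splu`, offered for comparison since CuPy mirrors its API): `splu` factorizes in CSC form, so passing a COO matrix forces an internal conversion on every iteration, which SciPy makes visible as a `SparseEfficiencyWarning`. Converting explicitly avoids that repeated work. The matrix here is made-up example data:

```python
import warnings
import scipy.sparse as sp
import scipy.sparse.linalg as sla
from scipy.sparse import SparseEfficiencyWarning

n = 20
# made-up example matrix; the diagonal shift keeps it nonsingular
A = (sp.random(n, n, density=0.2, random_state=0) + 10 * sp.eye(n)).tocoo()

with warnings.catch_warnings(record=True) as caught_coo:
    warnings.simplefilter("always")
    sla.splu(A)            # COO input: converted to CSC internally, SciPy warns

with warnings.catch_warnings(record=True) as caught_csc:
    warnings.simplefilter("always")
    sla.splu(A.T.tocsc())  # pre-converted input: no conversion warning
```

This does not change the leak itself, but pre-converting with `.tocsc()` is cheap insurance in a tight loop like the one above.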
Installation
Conda-Forge (`conda install ...`)
Environment
```
OS                           : Linux-5.4.0-122-generic-x86_64-with-glibc2.31
Python Version               : 3.10.5
CuPy Version                 : 11.0.0
CuPy Platform                : NVIDIA CUDA
NumPy Version                : 1.23.1
SciPy Version                : 1.9.0
Cython Build Version         : 0.29.30
Cython Runtime Version       : None
CUDA Root                    : /home/anadodik/miniconda3/envs/donut
nvcc PATH                    : None
CUDA Build Version           : 11020
CUDA Driver Version          : 11070
CUDA Runtime Version         : 11060
cuBLAS Version               : (available)
cuFFT Version                : 10600
cuRAND Version               : 10209
cuSOLVER Version             : (11, 3, 2)
cuSPARSE Version             : (available)
NVRTC Version                : (11, 6)
Thrust Version               : 101000
CUB Build Version            : 101000
Jitify Build Version         : 3c4a4ba
cuDNN Build Version          : 8401
cuDNN Version                : 8401
NCCL Build Version           : 21212
NCCL Runtime Version         : 21304
cuTENSOR Version             : 10500
cuSPARSELt Build Version     : None
Device 0 Name                : NVIDIA TITAN Xp
Device 0 Compute Capability  : 61
Device 0 PCI Bus ID          : 0000:04:00.0
```
Additional Information
No response
Issue Analytics
- State:
- Created: a year ago
- Comments: 13 (6 by maintainers)
Top GitHub Comments
> seems to be a leak in nvidia libraries and not CuPy

> To be fixed in #7039.