DOC: complex conjugation in BLAS functions
Is your feature request related to a problem? Please describe.
I have a very large complex array A and I want to apply its adjoint (conjugate transpose) to a vector y. If I use A.conj().T @ y then A.conj() is allocated explicitly. That consumes too much memory. It’s possible to bypass allocation by calling scipy.linalg.blas.zgemv(1.0, A, y, trans=2).
The trouble is: the meaning of the trans parameter isn’t specified in SciPy’s zgemv documentation. I only learned of this solution by guessing the behavior of trans and trying it out.
Describe the solution you’d like.
SciPy’s documentation for complex BLAS functions should indicate how to perform the conjugate-transpose operation. I.e., explain the meanings of trans = 0, 1, or 2 in documentation for the relevant functions.
I think the change is reasonable because people who are familiar with common C or FORTRAN BLAS interfaces expect trans to be a character (N, T, or H).
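To make the requested documentation concrete, here is a minimal sketch that checks each value of trans against the equivalent NumPy expression. The variable names and the choice of a Fortran-ordered test matrix are assumptions made for illustration; the mapping itself is what the issue asks to have documented, not something the current docstring states.

```python
import numpy as np
from scipy.linalg import blas

rng = np.random.default_rng(0)

# Fortran-ordered test matrix (an assumption of this sketch, so the wrapper
# does not need to reorder the data before calling BLAS).
A = np.asfortranarray(
    rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))
)
x = rng.standard_normal(2) + 1j * rng.standard_normal(2)  # length = columns of A
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)  # length = rows of A

no_trans = blas.zgemv(1.0, A, x, trans=0)   # expected to match A @ x
transpose = blas.zgemv(1.0, A, y, trans=1)  # expected to match A.T @ y
adjoint = blas.zgemv(1.0, A, y, trans=2)    # expected to match A.conj().T @ y

print(np.allclose(no_trans, A @ x))
print(np.allclose(transpose, A.T @ y))
print(np.allclose(adjoint, A.conj().T @ y))
```

Note that the input vector length changes with trans: it must match the number of columns of A for trans=0 and the number of rows for trans=1 or 2.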
Describe alternatives you’ve considered.
N/A
Additional context (e.g. screenshots, GIFs)
Code snippet showing undocumented functionality:
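import numpy as np
import scipy.linalg as la
import scipy.linalg.blas as blas
# Random complex 3x2 matrix and a complex vector of length 3 (imaginary part fixed at 1)
A = np.random.randn(3,2) + 1j*np.random.randn(3,2)
y = np.random.randn(3) + 1j
# Explicit adjoint: the conjugation allocates a full-size temporary
inefficient = A.T.conj() @ y
# Same product through BLAS; trans=2 requests the conjugate transpose
efficient = blas.zgemv(1.0, A, y, trans=2)
# The difference should be at rounding-error level
print(la.norm(inefficient - efficient))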
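One caveat on the memory argument, offered as a hedged sketch rather than a statement about SciPy internals: the f2py-generated wrappers generally expect Fortran-ordered arrays, so passing a C-ordered A (NumPy's default) may itself trigger a full-size copy. Assuming that behavior, a C-ordered A can be handled by passing the view A.T, which is Fortran-contiguous, and moving the conjugation onto the vectors, using the identity A^H y = conj(A^T conj(y)).

```python
import numpy as np
from scipy.linalg import blas

rng = np.random.default_rng(1)

# C-ordered complex matrix (NumPy's default layout) and a complex vector.
A = rng.standard_normal((3, 2)) + 1j * rng.standard_normal((3, 2))
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# A.T is a view on the same buffer and is Fortran-contiguous.
B = A.T
assert B.flags.f_contiguous and np.shares_memory(A, B)

# A^H y = conj(B) @ y = conj(B @ conj(y)); only vector-sized temporaries
# are created by the two .conj() calls.
adjoint = blas.zgemv(1.0, B, y.conj(), trans=0).conj()

# Reference result computed the memory-hungry way, for comparison only.
reference = A.conj().T @ y
print(np.allclose(adjoint, reference))
```

If A is already Fortran-ordered, the direct trans=2 call from the snippet above is the simpler choice.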
Issue Analytics
- Created 2 years ago
- Reactions:1
- Comments:5 (3 by maintainers)
I am taking a look into this - the difficulty is that these are autogenerated modules, and the documentation is also autogenerated from the wrappers, without any extra info about the parameters that f2py can parse. I don’t see any immediate solutions other than being able to use a comment as extra info for the docstring.
Great. So, what are some possibilities for moving forward here? Is there a brute-force option where someone (e.g., me) could expand a couple dozen documentation files with a suitable phrase for complex conjugation? I’d suggest something like “Here, setting trans = 0, 1, 2 corresponds to trans = ‘N’, ‘T’, ‘H’ in the Netlib LAPACK implementation.” You also seem to suggest that modifying f2py might work. Who could we talk to in order to figure out if that’s actually viable?