Question: complex number support in Array API?
I am wondering why complex numbers are not considered in the Array API, and whether we could give a second thought to making them native dtypes in the API.
The Dataframe API is not considered in the rest of this issue 🙂
I spent quite some time making sure complex numbers are first-class citizens in CuPy, as many scientific computing applications require them. In quantum mechanics, for example, complex numbers are a cornerstone and we can’t live without them. Even in some of the machine learning / deep learning work that we do, either classical or quantum (yes, for those who don’t know already, there is quantum machine learning 😁), we also need complex numbers in various places, like building tensors or communicating with simulations, especially for physics-aware neural networks, so it is a great pain for us not to be able to build and operate on complex numbers natively.
To date, complex numbers are also an integral part of mainstream programming languages. For example, C has had them since C99, and so has C++ (std::complex). Our beloved Python has a built-in complex type too, so it is just so weird IMHO that when we talk about native dtypes they’re being excluded.
As for language extensions to support GPUs, in CUDA we have thrust::complex (which currently supports complex64/complex128) as a clone of std::complex, and it is likely that libcu++ will replace Thrust on this aspect. In ROCm there’s also a Thrust clone and native support in HIP, so at least on NVIDIA/AMD GPUs we are good.
Turning to library support, as far as I know:
- NumPy supports complex64/complex128, but not complex32 (https://github.com/numpy/numpy/issues/14753)
- CuPy supports complex64/complex128, and complex32 is being evaluated (ex: https://github.com/cupy/cupy/pull/4454)
- PyTorch’s support for complex32/complex64/complex128 is catching up (I am unaware of any meta-issue summarizing the status quo, but the label "module: complex" is a good reference)
- SciPy / cupyx.scipy has many components supporting complex numbers, the most recent prominent case being the extensive ndimage overhaul (ex: https://github.com/scipy/scipy/pull/12725) done by @grlee77 for image processing (yes, image processing also needs complex numbers!)
The reason I also mention complex32 above is that CUDA now provides complex32 support in some CUDA libraries like cuBLAS and cuFFT. With special hardware acceleration for float16, it is expected that complex32 can also benefit; see the preliminary FFT test being done in https://github.com/cupy/cupy/pull/4407. Hopefully, by having complex number support in ML/DL frameworks (complex64 and complex128 are enough to start), many more applications can benefit as well.
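To make the complex32 idea concrete, here is a minimal sketch of what such a value looks like in memory, assuming the same interleaved real/imaginary layout that complex64 and complex128 use. The helper names pack_complex32 and unpack_complex32 are made up for illustration and are not part of NumPy, CuPy, or CUDA.

```python
import struct


def pack_complex32(z: complex) -> bytes:
    """Pack a complex number as two little-endian IEEE 754 half floats."""
    # "e" is the struct format code for a 16-bit half-precision float.
    # Assumed layout: real part first, imaginary part second.
    return struct.pack("<2e", z.real, z.imag)


def unpack_complex32(buf: bytes) -> complex:
    """Inverse of pack_complex32: read the (real, imag) halves back."""
    real, imag = struct.unpack("<2e", buf)
    return complex(real, imag)


z = 1.5 - 0.25j                    # both parts are exactly representable in float16
buf = pack_complex32(z)
print(len(buf) * 8)                # total width in bits
print(unpack_complex32(buf) == z)  # round trip is exact for such values
```

Values whose parts are not exactly representable in half precision (0.1, for instance) come back rounded, which is the precision trade-off any complex32 user would have to accept.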
I am aware that the Array API picks DLPack as the primary protocol for zero-copy data exchange, and that DLPack currently lacks complex number support. This is one of the reasons I do not like DLPack. While I will create a separate issue to discuss alternatives to DLPack, I think revising DLPack’s format is fairly straightforward (and should be done asap regardless of the Array API standardization, given the needs of ML/DL libraries).
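As a rough illustration of why the format change looks small: DLPack describes a dtype with a three-field struct (type code, bit width, lanes), so complex support mostly amounts to agreeing on one more type code. The sketch below mirrors that struct with ctypes. The KDL_COMPLEX constant and its value are assumptions made for illustration, as is the make_dtype helper; check the DLPack header before relying on any of them.

```python
import ctypes


class DLDataType(ctypes.Structure):
    """Mirror of DLPack's dtype descriptor: type code, bit width, lanes."""
    _fields_ = [
        ("code", ctypes.c_uint8),
        ("bits", ctypes.c_uint8),
        ("lanes", ctypes.c_uint16),
    ]


# Type codes for signed int, unsigned int, and float.
KDL_INT, KDL_UINT, KDL_FLOAT = 0, 1, 2
# Assumption: a dedicated complex type code. The value is illustrative only.
KDL_COMPLEX = 5


def make_dtype(code: int, bits: int) -> DLDataType:
    """Hypothetical helper: build a scalar (single-lane) dtype descriptor."""
    return DLDataType(code=code, bits=bits, lanes=1)


# "bits" counts the whole element, i.e. real and imaginary parts together.
complex64 = make_dtype(KDL_COMPLEX, 64)
complex128 = make_dtype(KDL_COMPLEX, 128)
print(ctypes.sizeof(DLDataType))
```

Under this convention the descriptor itself does not grow; only consumers need to learn how to interpret the new code.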
Disclaimer: This issue is merely for my research interests (relevant to my and other colleagues’ work) and is not driven by CuPy, one of the Array API stakeholders I will represent.
Top GitHub Comments
Thanks for bringing this up @leofang. You’re not the first to ask, so it’s good to document this decision and have a summary of the current status.
The main issue is that not all array libraries have good support for complex dtypes yet. Complex numbers are quite important for science, but are only of marginal importance to deep learning. TensorFlow, PyTorch and MXNet all don’t have great support yet. When there’s such partial support of a feature, we have in some cases chosen to include the feature in the current version of the standard if it was not too difficult for those libraries to implement it. But for complex support, it’s a ton of work. Hence it would be a feature that would only be fully supported by ~50% of libraries.
Excluding it from the standard doesn’t mean CuPy cannot have it - it just means it is not part of the standard, so shouldn’t have those dtypes in the separate array-api-supporting namespace. Which signals to users that as of today one cannot write code that is portable between libraries using complex. They can still use it in CuPy (and NumPy, JAX).
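In practice, portable code could probe the namespace before relying on complex dtypes. The following is a minimal sketch using stand-in namespaces built from SimpleNamespace; standard_ns, extended_ns, supports_complex, and pick_dtype are all hypothetical names, and a real library's array-API namespace would take the place of the stand-ins.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for array namespaces. Dtypes are plain strings here.
standard_ns = SimpleNamespace(float32="float32", float64="float64")
extended_ns = SimpleNamespace(
    float32="float32", float64="float64",
    complex64="complex64", complex128="complex128",
)


def supports_complex(xp) -> bool:
    """Return True if the namespace exposes both complex dtypes."""
    return all(hasattr(xp, name) for name in ("complex64", "complex128"))


def pick_dtype(xp, want_complex: bool):
    """Choose a dtype, failing loudly when complex is requested but absent."""
    if want_complex:
        if not supports_complex(xp):
            raise TypeError("this array namespace does not provide complex dtypes")
        return xp.complex128
    return xp.float64


print(supports_complex(standard_ns), supports_complex(extended_ns))
```

Raising an explicit error keeps the lack of portability visible instead of silently falling back to a real dtype.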
PyTorch’s implementation is indeed getting there, but not yet complete. TensorFlow does have an implementation in progress, but is further behind AFAIK. In 12 months from now it should be feasible to add complex64 and complex128 to the standard I believe.
We had a similar discussion about bfloat16 - that’s a dtype that deep learning libraries consider much more important than complex64/128, but NumPy doesn’t have it and is quite reluctant to add it.
Yep, that’s been a mess for a long time. One thing about this standard is that we try to not do too much innovation; if a feature is still under discussion or being changed in a library, in most cases we should choose wait-and-see (maybe ensuring that libraries don’t make incompatible choices), and then only add it to the standard if things have stabilized. So with an issue like “sorting behaviour for complex numbers” I’d choose to leave it out, since sorting isn’t all that important for the physics/engineering type applications that need complex numbers.
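To see why sorting is a natural thing to leave unspecified, note that complex numbers have no canonical order. The sketch below uses only Python's built-in complex type; the two conventions shown are illustrative choices, not a statement of what any particular array library does.

```python
values = [2 + 1j, 1 + 3j, 1 - 1j]

# Python's built-in complex type defines no ordering at all.
try:
    sorted(values)
    ordering_defined = True
except TypeError:
    ordering_defined = False

# One possible convention: lexicographic order, comparing real parts
# first and using imaginary parts to break ties.
lexicographic = sorted(values, key=lambda z: (z.real, z.imag))

# Another equally defensible convention: order by magnitude.
by_magnitude = sorted(values, key=abs)

print(ordering_defined)
print(lexicographic)
print(by_magnitude)
```

The two conventions disagree on this input, which is exactly the kind of incompatible choice a standard would have to settle before specifying the behaviour.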
The CuPy implementation seems to use float16, but I would have guessed that the hardware acceleration is for bfloat16. Can it accelerate regular half-precision too?
+1 to that. The solution is identical, so better get the explicit error - it’s almost always user error anyway.
https://numpy.org/neps/nep-0041-improved-dtype-support.html https://numpy.org/neps/nep-0042-new-dtypes.html https://numpy.org/neps/nep-0043-extensible-ufuncs.html
Large parts of that are landing in NumPy 1.20.0 next month.
Some PyTorch casting rules are getting closer to NumPy’s: for example, automatic integer-to-float promotion for functions that return floats is almost complete and NumPy-like.
Documentation is a bit limited; the casting rule docs are at https://pytorch.org/docs/stable/tensor_attributes.html#type-promotion-doc. The implementation under the hood is very different; I like http://blog.ezyang.com/2019/05/pytorch-internals/ as a guide.