Feature request: Support cuFFT callback routines
This is something that @maxpkatz and I tried a while ago but never got working. It would be nice to support the cuFFT callback routines in CuPy:
https://docs.nvidia.com/cuda/cufft/index.html#callback-routines
https://developer.nvidia.com/blog/cuda-pro-tip-use-cufft-callbacks-custom-data-processing/
The idea is to reduce global memory access by fusing the elementwise pre- and post-processing kernels with the FFT itself, which should improve performance.
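To make the fusion idea concrete, here is a minimal pure-Python sketch. It is only an illustration of the semantics: `naive_dft`, `load_cb` and `store_cb` are hypothetical names, and the O(n^2) DFT merely stands in for a cuFFT plan. It is not the CuPy or cuFFT API and it says nothing about performance.

```python
import cmath


def naive_dft(data, load_cb=None, store_cb=None):
    """Naive O(n^2) DFT standing in for a cuFFT plan (illustration only).

    load_cb(buf, i) returns the value used as input element i,
    playing the role of a cuFFT load callback.
    store_cb(out, k, value) writes output element k,
    playing the role of a cuFFT store callback.
    """
    n = len(data)
    load = load_cb or (lambda buf, i: buf[i])
    out = [0j] * n
    for k in range(n):
        acc = 0j
        for i in range(n):
            acc += load(data, i) * cmath.exp(-2j * cmath.pi * k * i / n)
        if store_cb is None:
            out[k] = acc
        else:
            store_cb(out, k, acc)
    return out


signal = [1.0, 2.0, 3.0, 4.0]
scale = 0.5

# Unfused: a separate elementwise pass writes a temporary array,
# which the transform then has to read back.
temp = [scale * x for x in signal]
unfused = naive_dft(temp)

# Fused pre-processing: the scaling is applied as each element is loaded,
# so no temporary array is written.
fused = naive_dft(signal, load_cb=lambda buf, i: scale * buf[i])


def normalize_on_store(out, k, value):
    # Fused post-processing: divide by n while writing the result.
    out[k] = value / len(signal)


normalized = naive_dft(signal, store_cb=normalize_on_store)
```

In the unfused path the intermediate array makes an extra round trip through memory; on a GPU that round trip goes through global memory, which is the traffic the callbacks are meant to avoid.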
The problem we encountered is that callbacks require linking against cuFFT's static library instead of the shared library. If we do this by changing this line
https://github.com/cupy/cupy/blob/5dcd1a53c2c6667b6cfafc132402554f8e5d3299/cupy_setup_build.py#L141
to 'cufft_static' (corrected from 'cufftw'), as instructed by the cuFFT documentation, and rebuild CuPy (to be precise, the cupy.cuda.cufft module), we get "undefined symbols" errors when importing CuPy. Those symbols are for the callback support and are leaked from libcufft_static.a (corrected from libcufftw.a) into the module object.
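As a rough sketch of the build change described above, the helper below swaps the shared cuFFT library for the static one in a list of library names. The function `use_static_cufft` and the example list are hypothetical; the real list lives in cupy_setup_build.py and may look different. The extra 'culibos' entry reflects the cuFFT documentation, which states that libcufft_static.a depends on libculibos.a.

```python
def use_static_cufft(libraries):
    """Return a copy of `libraries` with shared cuFFT replaced by the static one."""
    result = []
    for name in libraries:
        if name == 'cufft':
            # libcufft_static.a depends on libculibos.a, so list both.
            result.extend(['cufft_static', 'culibos'])
        else:
            result.append(name)
    return result


# Hypothetical library list, for illustration only.
shared = ['cublas', 'cudart', 'cufft', 'curand']
static = use_static_cufft(shared)
```

The cuFFT documentation also requires callback code to be compiled as relocatable device code and device-linked with nvcc. Whether a missing device-link step explains the undefined symbols here is an assumption, not something established in this thread.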
I can work out this support myself, but I need someone to help me resolve this issue.
Issue Analytics
- Created 3 years ago
- Comments: 16 (13 by maintainers)
Top GitHub Comments
I have a working prototype in this branch! Will polish it and send a PR.
I just quickly wanted to give a thumbs up here for the discussion. Easily fusing affine transformations or other elementwise kernels with the FFT or similar compute kernels is a very important feature for science.