Feature request: Support cuFFT callback routines

This is something that @maxpkatz and I tried a while ago but never figured out. It would be nice to support the cuFFT callback routines in CuPy:

https://docs.nvidia.com/cuda/cufft/index.html#callback-routines
https://developer.nvidia.com/blog/cuda-pro-tip-use-cufft-callbacks-custom-data-processing/

The idea is to fuse the elementwise pre- and post-processing kernels with the FFT operations, which reduces global memory accesses and improves performance.
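To make the fusion idea concrete, here is a minimal pure-Python model of it. This is not the cuFFT or CuPy API: `dft`, `fused_fft`, `unfused_fft`, `load_cb`, and `store_cb` are hypothetical names chosen for illustration, and a naive O(n^2) DFT stands in for the real FFT. It only shows that applying the elementwise steps inside load/store hooks gives the same result as running them as separate passes, while skipping the intermediate arrays.

```python
import cmath


def dft(n, load, store):
    """Naive O(n^2) DFT that reads input through load(i) and writes
    output through store(k, value), mimicking load/store callbacks.
    (A real FFT loads each element far fewer times; this is only a model.)"""
    for k in range(n):
        acc = 0j
        for i in range(n):
            acc += load(i) * cmath.exp(-2j * cmath.pi * k * i / n)
        store(k, acc)


def fused_fft(raw, window, scale):
    """Pre- and post-processing happen inside the hooks: no temporaries."""
    n = len(raw)
    out = [0j] * n

    def load_cb(i):
        # Pre-processing applied as each element is loaded.
        return raw[i] * window[i]

    def store_cb(k, value):
        # Post-processing applied as each element is stored.
        out[k] = value * scale

    dft(n, load_cb, store_cb)
    return out


def unfused_fft(raw, window, scale):
    """Same computation as three separate passes with intermediate arrays."""
    n = len(raw)
    windowed = [raw[i] * window[i] for i in range(n)]   # pass 1: pre-process
    spectrum = [0j] * n
    dft(n, windowed.__getitem__, spectrum.__setitem__)  # pass 2: transform
    return [v * scale for v in spectrum]                # pass 3: post-process
```

On a GPU, the `windowed` and `spectrum` temporaries in the unfused version correspond to extra round trips through global memory, which is what the callbacks are meant to avoid.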

The problem we encountered is that callbacks require linking against cuFFT's static library instead of the shared library. If we do this by changing this line https://github.com/cupy/cupy/blob/5dcd1a53c2c6667b6cfafc132402554f8e5d3299/cupy_setup_build.py#L141 to 'cufft_static', as instructed by the cuFFT documentation, and rebuild CuPy (to be precise, the cupy.cuda.cufft module), we get "undefined symbols" errors when importing CuPy. Those symbols are for the callback support and leak from libcufft_static.a into the module object.

I can work out this support myself, but I need someone to help me resolve this issue.

cc: @bjoernenders @smarkesini

Top GitHub Comments

leofang commented, Oct 15, 2020

The undefined symbol error can be reproduced simply by building from my branch: https://github.com/leofang/cupy/tree/cufft_retry.

I have a working prototype in this branch! Will polish it and send a PR.

bjoernenders commented, Oct 9, 2020

I just quickly wanted to give a thumbs up here for the discussion. Easily fusing affine transformations or other elementwise kernels with the FFT or similar compute kernels is a very important feature for science.
