Feature request: Support cuFFT callback routines

This is something that @maxpkatz and I tried a while ago but never figured out. It would be nice to support the cuFFT callback routines in CuPy:

https://docs.nvidia.com/cuda/cufft/index.html#callback-routines
https://developer.nvidia.com/blog/cuda-pro-tip-use-cufft-callbacks-custom-data-processing/

The idea is to fuse the elementwise pre- and post-processing kernels with the FFT operations. This reduces the number of global memory accesses and should improve performance.
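To make the fusion idea concrete, here is a minimal CPU-only sketch in plain Python. It is not the cuFFT or CuPy API: the names `dft_with_callbacks`, `load_cb`, and `store_cb` are hypothetical, and a naive DFT stands in for the FFT. It only illustrates the semantics: a load callback transforms each element as it is read, and a store callback transforms each element as it is written, so no intermediate pre- or post-processed array is needed.

```python
import cmath


def dft_with_callbacks(data, load_cb=None, store_cb=None):
    """Naive O(n^2) DFT that reads input through load_cb and writes via store_cb.

    load_cb(data, j) and store_cb(k, value) are hypothetical stand-ins for
    cuFFT's load/store callbacks; this is an illustration, not the real API.
    """
    n = len(data)
    # Without a load callback, read the input element unchanged.
    load = (lambda j: load_cb(data, j)) if load_cb else (lambda j: data[j])
    out = [
        sum(load(j) * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]
    if store_cb:
        out = [store_cb(k, v) for k, v in enumerate(out)]
    return out


signal = [1.0, 2.0, 3.0, 4.0]
window = [0.5, 1.0, 1.0, 0.5]
n = len(signal)

# Unfused: separate pre- and post-processing passes, each producing a
# temporary array (on a GPU, each pass is an extra trip to global memory).
windowed = [s * w for s, w in zip(signal, window)]
unfused = [v / n for v in dft_with_callbacks(windowed)]

# Fused: the same windowing and normalization happen inside the transform.
fused = dft_with_callbacks(
    signal,
    load_cb=lambda d, j: d[j] * window[j],
    store_cb=lambda k, v: v / n,
)

print(all(abs(a - b) < 1e-12 for a, b in zip(fused, unfused)))
```

Both paths compute the same result; the difference on a GPU would be that the fused path avoids launching separate elementwise kernels and re-reading the data from global memory.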

The problem we encountered is that callbacks require linking to cuFFT's static library instead of the shared library. If we do this by changing this line https://github.com/cupy/cupy/blob/5dcd1a53c2c6667b6cfafc132402554f8e5d3299/cupy_setup_build.py#L141 to 'cufft_static', as instructed by the cuFFT documentation, and rebuild CuPy (to be precise, the cupy.cuda.cufft module), we get "undefined symbols" errors when importing CuPy. Those symbols are for the callback support and are leaked from libcufft_static.a into the module object.

I can work out this support myself, but I need someone to help me resolve this issue.

cc: @bjoernenders @smarkesini

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 16 (13 by maintainers)

Top GitHub Comments

2 reactions
leofang commented, Oct 15, 2020

The undefined symbol error can be reproduced simply by building from my branch: https://github.com/leofang/cupy/tree/cufft_retry.

I have a working prototype in this branch! Will polish it and send a PR.

2 reactions
bjoernenders commented, Oct 9, 2020

I just wanted to quickly give a thumbs up for this discussion. Easily fusing affine transformations or other elementwise kernels with the FFT or similar compute kernels is a very important feature for science.
