Implementation of `cupyx.scipy.ndimage.filters`
I have implemented nearly all of `cupyx.scipy.ndimage.filters`, matching `scipy.ndimage.filters` exactly. However, before I finish it I would like to confirm that I am developing it correctly. The file is currently in a gist: https://gist.github.com/coderforlife/d953303da4bb7d8d28e49a568cb107b2
This works towards https://github.com/cupy/cupy/issues/2099, although one of the features requested there is still among the missing features listed below.
Current Lacking Features:
- Does not have implementations of `generic_filter()` or `generic_filter1d()`.
- I want to make some changes to how the rank filters work (this includes `rank_filter`, `median_filter`, and `percentile_filter`) based on the answers below.
- There are no tests.
- There are no function docs (but most would just refer to the SciPy docs).
Changes From the Current `cupyx.scipy.ndimage.filters`:
- Implements many new functions (and they re-use slightly adjusted code that was already there).
- `convolve()` and `correlate()` were adjusted to support non-contiguous input arrays without copying them (weights are still forced to be contiguous, but they are expected to be quite small); see the usage sketch after this list.
- `convolve()` and `correlate()` are slightly (~5%) faster now.
- The generated `ElementwiseKernel`s are now more adjustable at runtime.
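To illustrate the non-contiguous support, here is a small usage sketch (the arrays are made up for illustration; `cupyx.scipy.ndimage.correlate` is the existing public API):

```python
import cupy
from cupyx.scipy import ndimage

x = cupy.arange(100, dtype=cupy.float32).reshape(10, 10)
view = x[:, ::2]                                   # non-contiguous view of the input
weights = cupy.ones((3, 3), dtype=cupy.float32) / 9

# With this change the strided view is filtered directly (no copy of the input);
# the small weights array is still made contiguous internally.
out = ndimage.correlate(view, weights, mode='constant', cval=0.0)
```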
Open Questions:
- The `ElementwiseKernel`s are only used internally and are never returned. Should they have as much as possible compiled into them? The current version of the code compiles in the number of dimensions, input shape, weights shape, border mode, constant value, and origin values. My version moves the input shape (actually the strides), constant value, and origin values to runtime values. Is it better to have more compile-time values and thus need to compile new `ElementwiseKernel`s more frequently, or to move more things to runtime values and need fewer compiles? (See the first sketch after this list.)
- To implement `generic_filter()` and `generic_filter1d()` I think they should take a `ReductionKernel` (or a Python function that can be fused into a `ReductionKernel`). However, CUDA does not let you call functions in other modules, so to accomplish this I was thinking about re-creating the reduction-kernel code inside the convolution kernel, similar to https://github.com/cupy/cupy/blob/de3d32536c39eefea1805f4a7a76a14356eb3bff/cupy/core/_reduction.pyx#L36, although it would actually be simpler since the entire reduction happens in a single thread and needs no synchronization. Is this a reasonable approach? (See the second sketch below.)
- Is it okay that NumPy is used to create some of the filter weights internally? It is several orders of magnitude faster than doing it in CuPy, even for large kernels of size 25. (A sketch of this is below as well.)
- Is the quality of the code okay?
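To make the compile-time vs. runtime trade-off concrete, here is a minimal sketch using a made-up constant-boundary fill rather than the real filter kernels (the kernel bodies and names below are for illustration only, not from the gist):

```python
import cupy

# Variant A (hypothetical): the constant value is baked into the kernel source,
# so every distinct cval triggers a new compilation, but the value is a literal
# in the generated CUDA code.
def make_fill_kernel(cval):
    return cupy.ElementwiseKernel(
        'float64 x, bool inside', 'float64 y',
        'y = inside ? x : {}'.format(float(cval)),
        'fill_cval_compiled')

# Variant B (hypothetical): the constant value is an ordinary kernel argument,
# so a single compiled kernel serves every cval at the cost of one extra load.
fill_cval_runtime = cupy.ElementwiseKernel(
    'float64 x, bool inside, float64 cval', 'float64 y',
    'y = inside ? x : cval',
    'fill_cval_runtime')
```

Variant A corresponds to the current code (more values compiled in, more compilations); Variant B corresponds to my version (more runtime arguments, fewer compilations).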
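For the `generic_filter()` question, this is roughly the user-facing usage I have in mind; only the `ReductionKernel` below is real CuPy, while the `generic_filter()` call is hypothetical since it is not implemented yet:

```python
import cupy

# A per-window reduction expressed as a standard ReductionKernel: map each
# element to itself, sum the mapped values, and write the sum out.
window_sum = cupy.ReductionKernel(
    'T x',          # input params
    'T y',          # output params
    'x',            # map expression (per element)
    'a + b',        # reduce expression
    'y = a',        # post-reduction expression
    '0',            # identity value
    'window_sum')

# Hypothetical usage once generic_filter() exists: the reduction body would be
# inlined into the filter's ElementwiseKernel and run per window in one thread.
# out = cupyx.scipy.ndimage.generic_filter(image, window_sum, size=3)
```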
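For the NumPy question, the pattern in question is simply building the small weight arrays on the host and transferring them once; a sketch (not the gist's actual code):

```python
import numpy
import cupy

def gaussian_weights(sigma, radius):
    # Build the (tiny) 1-D Gaussian kernel with NumPy on the host, then move it
    # to the GPU in a single transfer; constructing it element-wise in CuPy
    # launches many small kernels and is far slower.
    x = numpy.arange(-radius, radius + 1)
    w = numpy.exp(-0.5 * (x / sigma) ** 2)
    w /= w.sum()
    return cupy.asarray(w)
```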
Thanks for your consideration.
Top GitHub Comments
@grlee77 You originally asked whether the 1d and nd kernels could be combined, and the answer is yes. With a few tweaks to the nd code I was able to make the 1d filters use the nd kernel generation while maintaining their speed. I have updated the gist with this and will be updating the PR shortly. At this point all kernels in ndimage.filters are the same base kernel with a few substitutions. I am also curious about the shared-memory kernel you suggested and will be experimenting with that soon.
@Skielex The most recent gist addresses the speed issues you were seeing (along with the bugs). Length-1 dimensions no longer significantly impact speed, and performance is more reasonable in general.
Yes, scipy/scipy#11661