Implementation of `cupyx.scipy.ndimage.filters`

I have implemented nearly all of cupyx.scipy.ndimage.filters matching scipy.ndimage.filters exactly. However, before I finish it I would like to confirm that I am developing it correctly. I have the file currently in a gist: https://gist.github.com/coderforlife/d953303da4bb7d8d28e49a568cb107b2

This works towards https://github.com/cupy/cupy/issues/2099, although one of the functions requested there is among the features still missing.

Current Lacking Features:

  • Does not have the implementations of generic_filter() or generic_filter1d().
  • I want to make some changes to how rank filters work (includes rank_filter, median_filter, percentile_filter) based on the answers below.
  • There are no tests.
  • There are no function docs (but most would just refer to the scipy docs).
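
For context on the first missing feature, the behaviour that a GPU generic_filter() would have to reproduce can be sketched in plain Python. This is only an illustration, not code from the gist: it handles one-dimensional input and an odd `size` only, it assumes scipy's default 'reflect' border rule, and the names `reflect_index` and `generic_filter_sketch` are made up for this example.

```python
def reflect_index(i, n):
    """Map any integer index into [0, n) using the 'reflect' border rule
    (d c b a | a b c d | d c b a)."""
    i %= 2 * n
    return 2 * n - 1 - i if i >= n else i


def generic_filter_sketch(values, function, size):
    """Apply `function` to every length-`size` neighborhood of a 1-D list.

    Assumes an odd `size`, so the window is centered on each element.
    """
    n = len(values)
    left = size // 2  # window covers offsets -left .. size - left - 1
    out = []
    for center in range(n):
        window = [values[reflect_index(center + k - left, n)]
                  for k in range(size)]
        # Each output element is one independent, sequential reduction.
        out.append(function(window))
    return out
```

For instance, `generic_filter_sketch([1, 2, 3, 4], max, 3)` reduces the windows `[1, 1, 2]`, `[1, 2, 3]`, `[2, 3, 4]` and `[3, 4, 4]`. Because every output element only needs its own window, each one could in principle be computed by a single thread.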

Changes From Current cupyx.scipy.ndimage.filters:

  • Implements tons of new functions (and they re-use slightly adjusted code that was already there).
  • convolve() and correlate() now support non-contiguous input arrays without copying them (weights are still forced to be contiguous, but they are expected to be quite small).
  • convolve() and correlate() are now slightly (~5%) faster.
  • The ElementwiseKernels generated are now more runtime adjustable.
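
As a rough illustration of how a filter can read a non-contiguous input without copying it, the sketch below addresses a view through its offset and strides into one flat buffer. It is plain Python rather than the kernel code from the gist, strides are counted in elements instead of bytes, only a 'constant' border with fill value 0 is handled, and `view_get` and `correlate_rows` are hypothetical names.

```python
def view_get(buf, offset, strides, index):
    """Read one element of a strided view directly from the flat buffer."""
    return buf[offset + sum(i * s for i, s in zip(index, strides))]


def correlate_rows(buf, offset, shape, strides, weights):
    """Correlate each row of a 2-D strided view with `weights`.

    Uses a 'constant' border with fill value 0 and never materializes
    a contiguous copy of the view.
    """
    rows, cols = shape
    half = len(weights) // 2
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            acc = 0
            for k, w in enumerate(weights):
                j = c + k - half
                if 0 <= j < cols:  # outside the view contributes 0
                    acc += w * view_get(buf, offset, strides, (r, j))
            row.append(acc)
        out.append(row)
    return out


# A 3x4 C-contiguous buffer viewed as its 4x3 transpose: strides (1, 4).
buf = list(range(12))
transposed = correlate_rows(buf, 0, (4, 3), (1, 4), [1, 1, 1])
```

The transposed view is never copied; only the index arithmetic changes. Passing shape and strides as ordinary arguments is also what would allow them to be runtime values rather than compiled constants.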

Open Questions:

  • The ElementwiseKernels are only used internally and are never returned. Should they have as much as possible compiled into them? The current version of the code compiles the number of dimensions, input shape, weights shape, border mode, constant value, and origin values. My version moves input shape (and actually strides), constant value, and origin values to be runtime values. Is it better to have more compile time values and thus need to compile new ElementwiseKernels more frequently or have more things moved to runtime values and need fewer compiles?
  • To implement generic_filter() and generic_filter1d(), I think they should take a ReductionKernel (or a Python function that can be fused into a ReductionKernel). However, CUDA does not let you call functions in other modules. To work around this, I was thinking about re-creating the reduction kernel code inside the convolution kernel, like at https://github.com/cupy/cupy/blob/de3d32536c39eefea1805f4a7a76a14356eb3bff/cupy/core/_reduction.pyx#L36. It would actually be simpler, since the entire reduction would happen in a single thread and would not need synchronization. Is this a reasonable approach?
  • Is it okay that numpy is used to create some of the filters internally? It is several orders of magnitude faster than doing it in cupy, even for large kernels of size 25.
  • Is the quality of the code okay?
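
The compile-time versus runtime trade-off in the first question can be made concrete with a small simulation. This is only a sketch under assumptions: `build_baked` and `build_runtime` are hypothetical stand-ins that return a descriptive string instead of compiling anything, and `functools.lru_cache` plays the role of a kernel cache. The parameter split follows the description above (everything baked in, versus only the number of dimensions, weights shape and border mode).

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def build_baked(ndim, in_shape, w_shape, mode, cval, origin):
    """Stand-in for compiling a kernel with every parameter baked in."""
    return (f"kernel<ndim={ndim}, in={in_shape}, w={w_shape}, "
            f"{mode}, cval={cval}, origin={origin}>")


@lru_cache(maxsize=None)
def build_runtime(ndim, w_shape, mode):
    """Stand-in for a kernel taking shape, cval and origin as arguments."""
    return f"kernel<ndim={ndim}, w={w_shape}, {mode}>(shape, cval, origin)"


def count_compiles(calls):
    """Return (baked, runtime) cache-miss counts for a list of calls."""
    build_baked.cache_clear()    # also resets the hit/miss statistics
    build_runtime.cache_clear()
    for in_shape, w_shape, mode, cval, origin in calls:
        ndim = len(in_shape)
        build_baked(ndim, in_shape, w_shape, mode, cval, origin)
        build_runtime(ndim, w_shape, mode)
    return (build_baked.cache_info().misses,
            build_runtime.cache_info().misses)


calls = [
    ((512, 512), (3, 3), "reflect", 0.0, (0, 0)),
    ((256, 256), (3, 3), "reflect", 0.0, (0, 0)),   # new input shape
    ((512, 512), (3, 3), "reflect", 1.0, (0, 0)),   # new constant value
    ((512, 512), (3, 3), "constant", 0.0, (0, 0)),  # new border mode
]
```

With these four calls, the fully baked variant sees four distinct keys, while the runtime variant sees two, since only the border mode change forces a new kernel. The simulation says nothing about per-call speed, which is the other half of the trade-off and would have to be measured on a GPU.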

Thanks for your consideration.

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 4
  • Comments: 30 (26 by maintainers)

Top GitHub Comments

1 reaction
coderforlife commented, Mar 12, 2020

@grlee77 You originally asked whether 1d and nd could be combined, and the answer is yes. With a few tweaks to nd, I was able to make 1d work with the nd kernel generation while maintaining its speed. I have updated the gist with this and will be updating the PR shortly. At this point, all kernels in ndimage.filters are the same base kernel with a few substitutions. I am also curious about making the shared memory kernel you suggested elsewhere and will be playing with that soon.

@Skielex The most recent gist addresses the speed issues you were seeing (along with the bugs). Length-1 dimensions no longer significantly impact speed, and the speed is more reasonable in general.
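
One plausible reading of "1d through the nd kernel" is that a 1-D filter along an axis is treated as an n-D filter whose weights have length 1 in every other dimension. The sketch below shows that equivalence for 2-D nested lists in plain Python. It is an assumption about the approach, not code from the gist; `correlate2d` and `correlate1d` are hypothetical helpers, and only a 'constant' border with fill value 0 is handled.

```python
def correlate2d(image, weights):
    """2-D correlation with a 'constant' border (fill value 0)."""
    rows, cols = len(image), len(image[0])
    wr, wc = len(weights), len(weights[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0
            for i in range(wr):
                for j in range(wc):
                    rr, cc = r + i - wr // 2, c + j - wc // 2
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += weights[i][j] * image[rr][cc]
            out[r][c] = acc
    return out


def correlate1d(image, weights, axis):
    """1-D correlation along `axis`, expressed through the 2-D routine."""
    if axis == 0:
        nd_weights = [[w] for w in weights]  # shape (len(weights), 1)
    else:
        nd_weights = [list(weights)]         # shape (1, len(weights))
    return correlate2d(image, nd_weights)


image = [[1, 2, 3], [4, 5, 6]]
along_rows = correlate1d(image, [1, 0, -1], axis=1)
along_cols = correlate1d(image, [1, 1, 1], axis=0)
```

In this formulation the loops over length-1 weight dimensions do a single iteration each, which is consistent with the remark that length-1 dimensions should not cost much.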

1 reaction
coderforlife commented, Mar 11, 2020

Yes, scipy/scipy#11661


