[FEA] Added functionality to ElementwiseKernel

To begin, cuSignal has increased its use of CuPy’s ElementwiseKernel functionality with great success!

I would like to request two additional features.

Performance

  1. Adding the __restrict__ qualifier to pointer parameters is known to let the compiler perform additional optimizations, as does marking read-only data const. https://developer.nvidia.com/blog/cuda-pro-tip-optimize-pointer-aliasing/ It would be great if both options were available: const and __restrict__ for input parameters, and __restrict__ alone for output parameters.

Functionality

  1. Passing in dtype for type inference. Currently, if a CuPy ElementwiseKernel can’t infer a data type, it must be hardcoded. I have found it’s faster not to create an empty output array and instead just pass size= to an ElementwiseKernel. But then I have to hardcode the data type of the output (if there’s no input array).

As an example,

_bohman_kernel = cp.ElementwiseKernel(
    "",
    "float64 w",
    """
    double fac { abs( start + delta * ( i - 1 ) ) };
    if ( i != 0 && i != ( _ind.size() - 1 ) ) {
        w = ( 1 - fac ) * cos( M_PI * fac ) + 1.0 / M_PI * sin( M_PI * fac );
    } else {
        w = 0;
    }
    """,
    "_bohman_kernel",
    options=("-std=c++11",),
    loop_prep="double delta { 2.0 / ( _ind.size() - 1 ) }; \
               double start { -1.0 + delta };",
)

w = _bohman_kernel(size=M)

Therefore, if I want the option of float64 and float32, I need to create two kernels plus logic to select the correct one.

It would be great if I could pass dtype, maybe something like

_bohman_kernel = cp.ElementwiseKernel(
    "",
    "T w, C a",
    """
    T fac { abs( start + delta * ( i - 1 ) ) };
    if ( i != 0 && i != ( _ind.size() - 1 ) ) {
        w = ( 1 - fac ) * cos( M_PI * fac ) + 1.0 / M_PI * sin( M_PI * fac );
        a = C(0, w);
    } else {
        w = 0;
        a = C(w, 0);
    }
    """,
    "_bohman_kernel",
    options=("-std=c++11",),
    loop_prep="double delta { 2.0 / ( _ind.size() - 1 ) }; \
               double start { -1.0 + delta };",
    
)

w = _bohman_kernel(size=M, dtype=( ("T", float64), ("C", complex128) ), )

@z-ryan1 @awthomp @leofang

Issue Analytics

  • State: open
  • Created 3 years ago
  • Reactions: 1
  • Comments: 13 (13 by maintainers)

Top GitHub Comments

3 reactions
emcastillo commented, Oct 14, 2020

I think we definitely have to add __restrict__ to both elementwise and reduction kernels. I will work on an implementation and do some benchmarking.

2 reactions
mnicely commented, Oct 20, 2020

@leofang Not dumb at all 😄 it’s just personal preference. I like how it catches illegal narrowing at compile time.
