ROCm RawModule template kernel with complex

Conditions (you can just paste the output of python -c 'import cupy; cupy.show_config()')
- CuPy version 9.1.0
- OS/Platform AMD ROCm
Code to reproduce

import cupy

code = r'''
#include <cupy/complex.cuh>
template<typename T>
__global__ void func(T* in_arr) { /* do something */ }
'''

kers = ('func<int>', 'func<complex<double>>')
mod = cupy.RawModule(code=code, options=('--std=c++11',),
                     name_expressions=kers, translate_cucomplex=False)

ker_int = mod.get_function(kers[1])

Error messages, stack traces, or logs

Calling RawModule.get_function() on a kernel whose template argument is a complex type raises:

Traceback (most recent call last):
  File "test.py", line 15, in <module>
    ker_int = mod.get_function(kers[1])
  File "cupy/_core/raw.pyx", line 485, in cupy._core.raw.RawModule.get_function
  File "cupy/_core/raw.pyx", line 96, in cupy._core.raw.RawKernel.kernel.__get__
  File "cupy/_core/raw.pyx", line 117, in cupy._core.raw.RawKernel._kernel
  File "cupy/cuda/function.pyx", line 253, in cupy.cuda.function.Module.get_function
  File "cupy/cuda/function.pyx", line 194, in cupy.cuda.function.Function.__init__
  File "cupy_backends/cuda/api/driver.pyx", line 269, in cupy_backends.cuda.api.driver.moduleGetFunction
  File "cupy_backends/cuda/api/driver.pyx", line 125, in cupy_backends.cuda.api.driver.check_status
cupy_backends.cuda.api.driver.CUDADriverError: hipErrorNotFound: hipErrorNotFound

However, if I do not use templates, the kernel runs fine. Are template kernels with complex types not supported on ROCm?

Issue Analytics

State:
Created 2 years ago
Comments: 20 (15 by maintainers)

Top GitHub Comments

2 reactions

amathews-amd commented, Jul 12, 2021

Internal ticket: https://ontrack-internal.amd.com/browse/SWDEV-294764

1 reaction

amathews-amd commented, Sep 20, 2021

@yxsamliu is the expert here, but this is my understanding (it may be wrong).

The NVRTC documentation (https://docs.nvidia.com/cuda/nvrtc/index.html#accessing-lowered-names) says that “NVRTC will parse the name expression string as a C++ constant expression at the end of the user program”. If the name expression string is parsed in the context of the program scope, it should pick up the using directive you mentioned. But if the name expression string is parsed outside the context of the program scope, you will have to use fully qualified names.

This might be the source of the confusion. Since the NVIDIA documentation does not make this clear, it might be better to be conservative and support the usage that provides context-free naming and clarity, that is, fully qualified names.
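
One conservative reading of the comment above is to pass fully qualified name expressions, so that lookup does not depend on a using directive being visible when the name is parsed. The sketch below is illustrative only. It assumes that cupy/complex.cuh exposes complex from the thrust namespace, which this thread does not confirm; qualify_complex and load_kernels are hypothetical helpers, not part of CuPy. The thread also does not say whether qualified names alone avoid hipErrorNotFound on ROCm with CuPy 9.1.0.

```python
import re


def qualify_complex(name_expression, namespace="thrust"):
    """Return name_expression with bare `complex<` rewritten as `<namespace>::complex<`.

    The `thrust` default is an assumption about where cupy/complex.cuh
    declares the type; verify it against your CuPy installation.
    """
    # The lookbehind skips names that are already qualified (`thrust::complex`)
    # and identifiers that merely end in "complex" (`my_complex`).
    return re.sub(r"(?<![\w:])complex\s*<", f"{namespace}::complex<", name_expression)


def load_kernels(code, name_expressions):
    """Hypothetical helper: compile `code` and look up each kernel by its qualified name."""
    import cupy  # imported lazily so the string helper above works without a GPU

    qualified = tuple(qualify_complex(k) for k in name_expressions)
    mod = cupy.RawModule(code=code, options=("--std=c++11",),
                         name_expressions=qualified, translate_cucomplex=False)
    # The same strings must be used for name_expressions and get_function.
    return {k: mod.get_function(k) for k in qualified}


kers = ("func<int>", "func<complex<double>>")
print(tuple(qualify_complex(k) for k in kers))
```

With the reproduction code from the issue, load_kernels(code, kers) would request the qualified form of the complex kernel instead of func<complex<double>>. Names that are already qualified pass through unchanged.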
