CuPy JIT failure in ROCm
Reproducible on ROCm 3.5.0 and 4.0.0:
$ pytest tests/cupyx_tests/jit_tests/
========================================================================= test session starts =========================================================================
platform linux -- Python 3.7.8, pytest-6.0.2, py-1.9.0, pluggy-0.13.1
rootdir: /home/leofang/dev/cupy_rocm350, configfile: setup.cfg
collected 9 items
tests/cupyx_tests/jit_tests/test_raw.py ....F.... [100%]
============================================================================== FAILURES ===============================================================================
_______________________________________________________________ TestRaw.test_raw_multidimensional_array _______________________________________________________________
self = <cupy.cuda.compiler._NVRTCProgram object at 0x7f66aff1fc50>
options = ('-D CUPY_JIT_MODE', '-I/home/leofang/dev/cupy_rocm350/cupy/_core/include', '-I/opt/rocm/include'), log_stream = None
def compile(self, options=(), log_stream=None):
try:
if self.name_expressions:
for ker in self.name_expressions:
nvrtc.addAddNameExpression(self.ptr, ker)
> nvrtc.compileProgram(self.ptr, options)
cupy/cuda/compiler.py:623:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> cpdef compileProgram(intptr_t prog, options):
cupy_backends/cuda/libs/nvrtc.pyx:133:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> check_status(status)
cupy_backends/cuda/libs/nvrtc.pyx:145:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> raise NVRTCError(status)
E cupy_backends.cuda.libs.nvrtc.NVRTCError: HIPRTC_ERROR_COMPILATION (6)
cupy_backends/cuda/libs/nvrtc.pyx:64: NVRTCError
During handling of the above exception, another exception occurred:
self = <cupyx_tests.jit_tests.test_raw.TestRaw testMethod=test_raw_multidimensional_array>
def test_raw_multidimensional_array(self):
@jit.rawkernel()
def f(x, y, n_row, n_col):
tid = jit.threadIdx.x + jit.blockDim.x * jit.blockIdx.x
ntid = jit.blockDim.x * jit.gridDim.x
size = n_row * n_col
for i in range(tid, size, ntid):
i_row = i // n_col
i_col = i % n_col
y[i_row, i_col] = x[i_row, i_col]
n, m = numpy.uint32(12), numpy.uint32(13)
x = testing.shaped_random((n, m), dtype=numpy.int32, seed=0)
y = testing.shaped_random((n, m), dtype=numpy.int32, seed=1)
> f((5,), (6,), (x, y, n, m))
tests/cupyx_tests/jit_tests/test_raw.py:59:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
cupyx/jit/_interface.py:71: in __call__
options=('-D CUPY_JIT_MODE',))
cupy/_core/core.pyx:1956: in cupy._core.core.compile_with_cache
cpdef function.Module compile_with_cache(
cupy/_core/core.pyx:2021: in cupy._core.core.compile_with_cache
return cuda.compile_with_cache(
cupy/cuda/compiler.py:430: in compile_with_cache
name_expressions, log_stream, cache_in_memory)
cupy/cuda/compiler.py:813: in _compile_with_cache_hip
log_stream, cache_in_memory)
cupy/cuda/compiler.py:272: in compile_using_nvrtc
name_expressions, log_stream, jitify)
cupy/cuda/compiler.py:255: in _compile
ptx, mapping = prog.compile(options, log_stream)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <cupy.cuda.compiler._NVRTCProgram object at 0x7f66aff1fc50>
options = ('-D CUPY_JIT_MODE', '-I/home/leofang/dev/cupy_rocm350/cupy/_core/include', '-I/opt/rocm/include'), log_stream = None
def compile(self, options=(), log_stream=None):
try:
if self.name_expressions:
for ker in self.name_expressions:
nvrtc.addAddNameExpression(self.ptr, ker)
nvrtc.compileProgram(self.ptr, options)
mapping = None
if self.name_expressions:
mapping = {}
for ker in self.name_expressions:
mapping[ker] = nvrtc.getLoweredName(self.ptr, ker)
if log_stream is not None:
log_stream.write(nvrtc.getProgramLog(self.ptr))
# TODO(leofang): use getCUBIN() for _cuda_version >= 11010?
return nvrtc.getPTX(self.ptr), mapping
except nvrtc.NVRTCError:
log = nvrtc.getProgramLog(self.ptr)
raise CompileException(log, self.src, self.name, options,
> 'nvrtc' if not runtime.is_hip else 'hiprtc')
E cupy.cuda.compiler.CompileException: /tmp/comgr-d17f2a/input/CompileSource:5417:7: error: no member named '_indexing' in 'CArray<int, 2, true, true>'
E y._indexing(thrust::make_tuple(i_row, i_col)) = x._indexing(thrust::make_tuple(i_row, i_col));
E ~ ^
E /tmp/comgr-d17f2a/input/CompileSource:5417:17: error: no member named 'make_tuple' in namespace 'thrust'; did you mean 'std::make_tuple'?
E y._indexing(thrust::make_tuple(i_row, i_col)) = x._indexing(thrust::make_tuple(i_row, i_col));
E ^~~~~~~~~~~~~~~~~~
E std::make_tuple
E /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/tuple:1448:5: note: 'std::make_tuple' declared here
E make_tuple(_Elements&&... __args)
E ^
E /tmp/comgr-d17f2a/input/CompileSource:5417:55: error: no member named '_indexing' in 'CArray<int, 2, true, true>'
E y._indexing(thrust::make_tuple(i_row, i_col)) = x._indexing(thrust::make_tuple(i_row, i_col));
E ~ ^
E /tmp/comgr-d17f2a/input/CompileSource:5417:65: error: no member named 'make_tuple' in namespace 'thrust'; did you mean 'std::make_tuple'?
E y._indexing(thrust::make_tuple(i_row, i_col)) = x._indexing(thrust::make_tuple(i_row, i_col));
E ^~~~~~~~~~~~~~~~~~
E std::make_tuple
E /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/tuple:1448:5: note: 'std::make_tuple' declared here
E make_tuple(_Elements&&... __args)
E ^
E 4 errors generated when compiling for gfx906.
E Error: Failed to compile opencl source (from CL or HIP source to LLVM IR).
cupy/cuda/compiler.py:636: CompileException
======================================================================= short test summary info =======================================================================
FAILED tests/cupyx_tests/jit_tests/test_raw.py::TestRaw::test_raw_multidimensional_array - cupy.cuda.compiler.CompileException: /tmp/comgr-d17f2a/input/CompileSourc...
===================================================================== 1 failed, 8 passed in 8.76s =====================================================================
There are multiple problems:
- hipRTC apparently does not recognize -D, so any macros remain undefined:
import cupy as cp
code = r'''
extern "C" __global__ void xyz(float* a) {
float x = 0;
#ifdef CUPY_JIT_MODE
x = 1;
#else
x = 2;
#endif
a[threadIdx.x] = x;
}
'''
options = ('-DCUPY_JIT_MODE',)
ker = cp.RawKernel(code, 'xyz', options=options, backend='nvrtc')
a = cp.empty((32,), dtype=cp.float32)
ker((1,), (32,), (a,))
print(a) # -> with backend='nvrtc': [2., 2., ...]; with backend='nvcc': [1., 1., ...]
cp.cuda.Device().synchronize()
- The header cupy/tuple.cuh is not manually unrolled: it should be added to the extra_sources list in cupy/_core/core.pyx.
- The thrust::swap implementation is not recognized: I actually don't know how the CUDA tests passed, because apparently we don't include it in the bundled headers. I think hiprtc/hipcc is correct in not recognizing it.
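For the first problem, one possible workaround is to turn the -D options into #define lines and prepend them to the kernel source before it reaches hipRTC. The sketch below is only an illustration of that idea: `split_defines` and `inject_defines` are hypothetical helpers, not part of CuPy, and the sketch assumes each option is a single string such as '-DNAME', '-DNAME=VALUE' or '-D NAME'.

```python
def split_defines(options):
    """Separate -D macro options from all other compiler options.

    Hypothetical helper (not a CuPy API). Returns a pair
    (define_lines, remaining_options).
    """
    defines, remaining = [], []
    for opt in options:
        if opt.startswith('-D'):
            # Accept both '-DNAME' and the spaced form '-D NAME'.
            body = opt[2:].strip()
            name, sep, value = body.partition('=')
            if not sep:
                # Like common compilers, a bare -DNAME defines NAME as 1.
                value = '1'
            defines.append(f'#define {name} {value}'.rstrip())
        else:
            remaining.append(opt)
    return defines, tuple(remaining)


def inject_defines(source, options):
    """Prepend the macros requested via -D to the source text."""
    defines, remaining = split_defines(options)
    if not defines:
        return source, remaining
    return '\n'.join(defines) + '\n' + source, remaining


src, opts = inject_defines(
    'int x;',
    ('-D CUPY_JIT_MODE', '-I/opt/rocm/include', '-DN=4'),
)
print(src)
print(opts)
```

The returned source and options could then be handed to cp.RawKernel in place of the originals. One trade-off is that the prepended lines shift the line numbers reported in compiler errors.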
I have a local fix I’m polishing for working around these issues. Not an ideal solution, though.
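To illustrate the header problem, and not as a description of the local fix mentioned above, here is a rough sketch of what "manually unrolling" a header could look like, assuming it means textually replacing an #include line with the header's contents. `inline_headers` and the sample header texts are hypothetical and only serve the example; the real handling lives in cupy/_core/core.pyx.

```python
import re

# Matches a whole '#include <name>' or '#include "name"' line.
_INCLUDE = re.compile(
    r'^[ \t]*#[ \t]*include[ \t]*[<"]([^>"]+)[>"][ \t]*$',
    re.MULTILINE,
)


def inline_headers(source, headers, _seen=None):
    """Replace known #include lines with the header text (hypothetical helper).

    `headers` maps include names (e.g. 'cupy/tuple.cuh') to their contents.
    Includes that are not in the mapping are left untouched, and each header
    is expanded at most once, which mimics an include guard.
    """
    seen = set() if _seen is None else _seen

    def expand(match):
        name = match.group(1)
        if name not in headers:
            return match.group(0)  # leave system or unknown includes alone
        if name in seen:
            return ''  # already inlined earlier
        seen.add(name)
        # Headers may include other bundled headers, so expand recursively.
        return inline_headers(headers[name], headers, seen)

    return _INCLUDE.sub(expand, source)


# Made-up header contents, for illustration only.
headers = {
    'a.cuh': '#include <b.cuh>\nstruct A {};',
    'b.cuh': 'struct B {};',
}
src = '#include <a.cuh>\n#include <b.cuh>\n#include <cstddef>\nint main() {}'
out = inline_headers(src, headers)
print(out)
```

Inlining this way avoids relying on the runtime compiler's include search, at the cost of larger sources and error locations that no longer match the original files.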
@takagi I reported to upstream: https://github.com/ROCm-Developer-Tools/HIP/issues/2248
hiprtc itself uses the -D option? https://github.com/ROCm-Developer-Tools/HIP/blob/main/rocclr/hip_rtc.cpp#L222-L223