Jitify fails when cuSPARSELt 0.2.0 is loaded
Description
Jitify fails to compile when the cuSPARSELt 0.2.0 shared library is loaded into the process.
To Reproduce
import cupy

_test_source1 = r'''
#include <cupy/cub/cub/block/block_reduce.cuh>

extern "C" __global__
void test_sum(const float* x1, const float* x2, float* y, unsigned int N) {
    int tid = blockDim.x * blockIdx.x + threadIdx.x;
    if (tid < N)
        y[tid] = x1[tid] + x2[tid];
}
'''

if __name__ == '__main__':
    import ctypes
    dll = ctypes.CDLL('/home/maehashi/local/cuda/cusparselt/libcusparse_lt/lib64/libcusparseLt.so.0.2.0.1')
    mod1 = cupy.RawModule(code=_test_source1,
                          backend='nvrtc',
                          options=(),
                          jitify=True)
    ker = mod1.get_function('test_sum')
This code raises cupy.cuda.compiler.JitifyException: Runtime compilation failed.
However, the problem disappears when the dll = ctypes.CDLL(...) line is commented out.
Full error output:
---------------------------------------------------
--- JIT compile log for /tmp/tmp_91sfpbc/781dd8cb9c775466a5d7241f50c2d8899f818631.cubin.cu ---
---------------------------------------------------
cupy/cub/cub/block/block_reduce.cuh(1): warning: extra text after expected end of preprocessing directive
cupy/cub/cub/block/block_reduce.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_3" and its replacement text
specializations/block_reduce_raking.cuh(1): warning: extra text after expected end of preprocessing directive
specializations/block_reduce_raking.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_7" and its replacement text
../../block/block_raking_layout.cuh(1): warning: extra text after expected end of preprocessing directive
../../block/block_raking_layout.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_6" and its replacement text
../util_macro.cuh(1): warning: extra text after expected end of preprocessing directive
../util_macro.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_B" and its replacement text
util_namespace.cuh(1): warning: extra text after expected end of preprocessing directive
util_namespace.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_E" and its replacement text
../util_arch.cuh(1): warning: extra text after expected end of preprocessing directive
../util_arch.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_A" and its replacement text
../util_type.cuh(1): warning: extra text after expected end of preprocessing directive
../util_type.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_D" and its replacement text
iostream(1): warning: extra text after expected end of preprocessing directive
iostream(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_8" and its replacement text
ostream(1): warning: extra text after expected end of preprocessing directive
ostream(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_4" and its replacement text
istream(1): warning: extra text after expected end of preprocessing directive
istream(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_1" and its replacement text
limits(1): warning: extra text after expected end of preprocessing directive
cfloat(1): warning: extra text after expected end of preprocessing directive
util_arch.cuh(1): warning: extra text after expected end of preprocessing directive
../util_type.cuh(1045): error: identifier "FLT_MAX" is undefined
../util_type.cuh(1049): error: identifier "FLT_MAX" is undefined
../util_type.cuh(1057): error: identifier "DBL_MAX" is undefined
../util_type.cuh(1061): error: identifier "DBL_MAX" is undefined
../util_type.cuh(1131): error: namespace "std" has no member "numeric_limits"
../util_type.cuh(1131): error: type name is not allowed
../util_type.cuh(1131): error: the global scope has no "is_signed"
../util_namespace.cuh(1): warning: extra text after expected end of preprocessing directive
../../warp/warp_reduce.cuh(1): warning: extra text after expected end of preprocessing directive
../../warp/warp_reduce.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_B9B" and its replacement text
specializations/warp_reduce_shfl.cuh(1): warning: extra text after expected end of preprocessing directive
specializations/warp_reduce_shfl.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_F" and its replacement text
../../thread/thread_operators.cuh(1): warning: extra text after expected end of preprocessing directive
../../util_ptx.cuh(1): warning: extra text after expected end of preprocessing directive
../../util_ptx.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_2" and its replacement text
util_type.cuh(1): warning: extra text after expected end of preprocessing directive
util_debug.cuh(1): warning: extra text after expected end of preprocessing directive
util_debug.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_C" and its replacement text
stddef.h(1): warning: extra text after expected end of preprocessing directive
../../util_type.cuh(1): warning: extra text after expected end of preprocessing directive
../../util_macro.cuh(1): warning: extra text after expected end of preprocessing directive
../../util_namespace.cuh(1): warning: extra text after expected end of preprocessing directive
specializations/warp_reduce_shfl.cuh(136): error: namespace "cub" has no member "Sum"
specializations/warp_reduce_shfl.cuh(173): error: namespace "cub" has no member "Sum"
specializations/warp_reduce_shfl.cuh(210): error: namespace "cub" has no member "Sum"
specializations/warp_reduce_shfl.cuh(252): error: namespace "cub" has no member "Sum"
specializations/warp_reduce_shfl.cuh(295): error: namespace "cub" has no member "Sum"
specializations/warp_reduce_shfl.cuh(343): error: SwizzleScanOp is not a template
specializations/warp_reduce_shfl.cuh(343): error: identifier "ReduceByKeyOp" is undefined
specializations/warp_reduce_shfl.cuh(343): error: namespace "cub" has no member "Sum"
specializations/warp_reduce_shfl.cuh(343): error: expected a ")"
specializations/warp_reduce_shfl.cuh(371): error: SwizzleScanOp is not a template
specializations/warp_reduce_shfl.cuh(371): error: identifier "ReduceBySegmentOp" is undefined
specializations/warp_reduce_shfl.cuh(371): error: namespace "cub" has no member "Sum"
specializations/warp_reduce_shfl.cuh(371): error: expected a ")"
specializations/warp_reduce_shfl.cuh(369): error: invalid redeclaration of member function template "cub::KeyValuePair<KeyT, ValueT> cub::WarpReduceShfl<T, LOGICAL_WARP_THREADS, PTX_ARCH>::ReduceStep(cub::KeyValuePair<KeyT, ValueT>, <error-type>)"
(341): here
specializations/warp_reduce_smem.cuh(1): warning: extra text after expected end of preprocessing directive
specializations/warp_reduce_smem.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_5E7" and its replacement text
../../thread/thread_load.cuh(1): warning: extra text after expected end of preprocessing directive
../../thread/thread_store.cuh(1): warning: extra text after expected end of preprocessing directive
../thread/thread_operators.cuh(1): warning: extra text after expected end of preprocessing directive
../../thread/thread_reduce.cuh(1): warning: extra text after expected end of preprocessing directive
specializations/block_reduce_raking_commutative_only.cuh(1): warning: extra text after expected end of preprocessing directive
specializations/block_reduce_raking_commutative_only.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_E4" and its replacement text
specializations/block_reduce_raking_commutative_only.cuh(98): warning: declaration does not declare anything
specializations/block_reduce_warp_reductions.cuh(1): warning: extra text after expected end of preprocessing directive
../util_ptx.cuh(1): warning: extra text after expected end of preprocessing directive
cupy/cub/cub/block/block_reduce.cuh(236): error: BlockReduceWarpReductions is not a template
22 errors detected in the compilation of "/tmp/tmp_91sfpbc/781dd8cb9c775466a5d7241f50c2d8899f818631.cubin.cu".
---------------------------------------------------
Traceback (most recent call last):
  File "/home/maehashi/Development/cupy/cupy/cuda/compiler.py", line 250, in _jitify_prep
    name, options, headers, include_names = jitify(
  File "cupy/cuda/jitify.pyx", line 59, in cupy.cuda.jitify.jitify
    cpdef jitify(str code, tuple opt, dict cached_sources=None):
  File "cupy/cuda/jitify.pyx", line 92, in cupy.cuda.jitify.jitify
    load_program(cuda_source, headers, nullptr, &include_paths,
RuntimeError: Runtime compilation failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/maehashi/Development/cupy/test.py", line 22, in <module>
    ker = mod1.get_function('test_sum')
  File "cupy/_core/raw.pyx", line 485, in cupy._core.raw.RawModule.get_function
    func = ker.kernel # noqa
  File "cupy/_core/raw.pyx", line 96, in cupy._core.raw.RawKernel.kernel.__get__
    return self._kernel()
  File "cupy/_core/raw.pyx", line 113, in cupy._core.raw.RawKernel._kernel
    mod = _get_raw_module(
  File "cupy/_util.pyx", line 67, in cupy._util.memoize.decorator.ret
    result = f(*args, **kwargs)
  File "cupy/_core/raw.pyx", line 547, in cupy._core.raw._get_raw_module
    mod = cupy._core.core.compile_with_cache(
  File "cupy/_core/core.pyx", line 2062, in cupy._core.core.compile_with_cache
    cpdef function.Module compile_with_cache(
  File "cupy/_core/core.pyx", line 2122, in cupy._core.core.compile_with_cache
    return cuda.compiler._compile_module_with_cache(
  File "/home/maehashi/Development/cupy/cupy/cuda/compiler.py", line 487, in _compile_module_with_cache
    return _compile_with_cache_cuda(
  File "/home/maehashi/Development/cupy/cupy/cuda/compiler.py", line 565, in _compile_with_cache_cuda
    ptx, mapping = compile_using_nvrtc(
  File "/home/maehashi/Development/cupy/cupy/cuda/compiler.py", line 316, in compile_using_nvrtc
    return _compile(source, options, cu_path,
  File "/home/maehashi/Development/cupy/cupy/cuda/compiler.py", line 285, in _compile
    options, headers, include_names = _jitify_prep(
  File "/home/maehashi/Development/cupy/cupy/cuda/compiler.py", line 258, in _jitify_prep
    raise JitifyException(str(cex))
cupy.cuda.compiler.JitifyException: Runtime compilation failed
Installation
Source (pip install cupy)
Environment
OS : Linux-4.15.0-153-generic-x86_64-with-glibc2.27
Python Version : 3.9.5
CuPy Version : 10.0.0rc1
CuPy Platform : NVIDIA CUDA
NumPy Version : 1.21.2
SciPy Version : 1.7.1
Cython Build Version : 0.29.24
Cython Runtime Version : 0.29.24
CUDA Root : /usr/local/cuda-11.4.1
nvcc PATH : ccache nvcc
CUDA Build Version : 11040
CUDA Driver Version : 11050
CUDA Runtime Version : 11040
cuBLAS Version : (available)
cuFFT Version : 10501
cuRAND Version : 10205
cuSOLVER Version : (11, 2, 0)
cuSPARSE Version : (available)
NVRTC Version : (11, 4)
Thrust Version : 101201
CUB Build Version : 101201
Jitify Build Version : 60e9e72
cuDNN Build Version : None
cuDNN Version : None
NCCL Build Version : None
NCCL Runtime Version : None
cuTENSOR Version : None
cuSPARSELt Build Version : None
Device 0 Name : Tesla P100-PCIE-16GB
Device 0 Compute Capability : 60
Device 0 PCI Bus ID : 0000:1A:00.0
Device 1 Name : Tesla P100-PCIE-16GB
Device 1 Compute Capability : 60
Device 1 PCI Bus ID : 0000:3D:00.0
Device 2 Name : Tesla P100-PCIE-16GB
Device 2 Compute Capability : 60
Device 2 PCI Bus ID : 0000:B1:00.0
Additional Information
cc/ @leofang
Top GitHub Comments
It seems that cuSPARSELt 0.2.0 automatically calls std::locale::global to inject a comma when formatting numbers, and that breaks Jitify's include guard generation code.
With dlopen:
Without dlopen (expected):
Because the ,9AC,9FF part is ignored by the compiler, the include guards do not work as expected, which causes the compilation failure.

Yes, confirmed that adding ss.imbue(std::locale("C")); in jitify.hpp works around the problem.
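
To see why a comma in the formatted number breaks the guards, here is a small Python simulation of the mechanism described in the comments. It is a sketch, not Jitify's actual code: the helper names (hex_with_grouping, guard_name, macro_seen_by_preprocessor) and the two hash values are made up for illustration, and the digit grouping is emulated by hand rather than through a real C++ locale. The point is that the preprocessor reads a macro name only up to the first character that cannot be part of an identifier, so everything after the first comma is dropped and distinct headers can end up sharing one guard.

```python
import re

PREFIX = "_JITIFY_INCLUDE_GUARD_"


def hex_with_grouping(value, sep=",", group=3):
    """Format value as uppercase hex with sep inserted every `group` digits.

    This hand-rolled grouping only emulates what a comma-injecting global
    locale would do to a number written to a stream (assumption for the demo).
    """
    digits = format(value, "X")
    parts = []
    while digits:
        parts.append(digits[-group:])   # take the rightmost group
        digits = digits[:-group]        # and drop it from the remainder
    return sep.join(reversed(parts))


def guard_name(value, grouped):
    """Build an include-guard name from a (hypothetical) header hash."""
    hex_part = hex_with_grouping(value) if grouped else format(value, "X")
    return PREFIX + hex_part


def macro_seen_by_preprocessor(text):
    """Return the leading identifier, i.e. what a #define would take as the name."""
    return re.match(r"[A-Za-z_][A-Za-z0-9_]*", text).group(0)


if __name__ == "__main__":
    # Two made-up hashes standing in for two different headers.
    hash_a, hash_b = 0xB9B9AC9FF, 0xB9B123456
    for grouped in (True, False):
        names = [guard_name(h, grouped) for h in (hash_a, hash_b)]
        seen = [macro_seen_by_preprocessor(n) for n in names]
        print("grouped" if grouped else "plain  ", names, "->", seen)
```

With grouping, both hypothetical headers collapse to _JITIFY_INCLUDE_GUARD_B9B, so the second header's body would be skipped as if it had already been included. That is consistent with the short guard names in the warnings and the missing-definition errors in the log. Without grouping, the two names stay distinct, which is the effect the ss.imbue(std::locale("C")) workaround aims for.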