Jitify fails when cuSPARSELt 0.2.0 is loaded

Description

Jitify fails when the cuSPARSELt 0.2.0 shared library is loaded into the same process. The cause was unknown at the time of filing.

To Reproduce

import cupy

_test_source1 = r'''
#include <cupy/cub/cub/block/block_reduce.cuh>

extern "C" __global__
void test_sum(const float* x1, const float* x2, float* y, unsigned int N) {
    int tid = blockDim.x * blockIdx.x + threadIdx.x;
    if (tid < N)
        y[tid] = x1[tid] + x2[tid];
}
'''

if __name__ == '__main__':
    import ctypes
    dll = ctypes.CDLL('/home/maehashi/local/cuda/cusparselt/libcusparse_lt/lib64/libcusparseLt.so.0.2.0.1')

    mod1 = cupy.RawModule(code=_test_source1,
                          backend='nvrtc',
                          options=(),
                          jitify=True)
    ker = mod1.get_function('test_sum')

This code raises cupy.cuda.compiler.JitifyException: Runtime compilation failed. The problem disappears when the dll = ctypes.CDLL(...) line is commented out.

Full error output:

---------------------------------------------------
--- JIT compile log for /tmp/tmp_91sfpbc/781dd8cb9c775466a5d7241f50c2d8899f818631.cubin.cu ---
---------------------------------------------------
cupy/cub/cub/block/block_reduce.cuh(1): warning: extra text after expected end of preprocessing directive

cupy/cub/cub/block/block_reduce.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_3" and its replacement text

specializations/block_reduce_raking.cuh(1): warning: extra text after expected end of preprocessing directive

specializations/block_reduce_raking.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_7" and its replacement text

../../block/block_raking_layout.cuh(1): warning: extra text after expected end of preprocessing directive

../../block/block_raking_layout.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_6" and its replacement text

../util_macro.cuh(1): warning: extra text after expected end of preprocessing directive

../util_macro.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_B" and its replacement text

util_namespace.cuh(1): warning: extra text after expected end of preprocessing directive

util_namespace.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_E" and its replacement text

../util_arch.cuh(1): warning: extra text after expected end of preprocessing directive

../util_arch.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_A" and its replacement text

../util_type.cuh(1): warning: extra text after expected end of preprocessing directive

../util_type.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_D" and its replacement text

iostream(1): warning: extra text after expected end of preprocessing directive

iostream(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_8" and its replacement text

ostream(1): warning: extra text after expected end of preprocessing directive

ostream(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_4" and its replacement text

istream(1): warning: extra text after expected end of preprocessing directive

istream(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_1" and its replacement text

limits(1): warning: extra text after expected end of preprocessing directive

cfloat(1): warning: extra text after expected end of preprocessing directive

util_arch.cuh(1): warning: extra text after expected end of preprocessing directive

../util_type.cuh(1045): error: identifier "FLT_MAX" is undefined

../util_type.cuh(1049): error: identifier "FLT_MAX" is undefined

../util_type.cuh(1057): error: identifier "DBL_MAX" is undefined

../util_type.cuh(1061): error: identifier "DBL_MAX" is undefined

../util_type.cuh(1131): error: namespace "std" has no member "numeric_limits"

../util_type.cuh(1131): error: type name is not allowed

../util_type.cuh(1131): error: the global scope has no "is_signed"

../util_namespace.cuh(1): warning: extra text after expected end of preprocessing directive

../../warp/warp_reduce.cuh(1): warning: extra text after expected end of preprocessing directive

../../warp/warp_reduce.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_B9B" and its replacement text

specializations/warp_reduce_shfl.cuh(1): warning: extra text after expected end of preprocessing directive

specializations/warp_reduce_shfl.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_F" and its replacement text

../../thread/thread_operators.cuh(1): warning: extra text after expected end of preprocessing directive

../../util_ptx.cuh(1): warning: extra text after expected end of preprocessing directive

../../util_ptx.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_2" and its replacement text

util_type.cuh(1): warning: extra text after expected end of preprocessing directive

util_debug.cuh(1): warning: extra text after expected end of preprocessing directive

util_debug.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_C" and its replacement text

stddef.h(1): warning: extra text after expected end of preprocessing directive

../../util_type.cuh(1): warning: extra text after expected end of preprocessing directive

../../util_macro.cuh(1): warning: extra text after expected end of preprocessing directive

../../util_namespace.cuh(1): warning: extra text after expected end of preprocessing directive

specializations/warp_reduce_shfl.cuh(136): error: namespace "cub" has no member "Sum"

specializations/warp_reduce_shfl.cuh(173): error: namespace "cub" has no member "Sum"

specializations/warp_reduce_shfl.cuh(210): error: namespace "cub" has no member "Sum"

specializations/warp_reduce_shfl.cuh(252): error: namespace "cub" has no member "Sum"

specializations/warp_reduce_shfl.cuh(295): error: namespace "cub" has no member "Sum"

specializations/warp_reduce_shfl.cuh(343): error: SwizzleScanOp is not a template

specializations/warp_reduce_shfl.cuh(343): error: identifier "ReduceByKeyOp" is undefined

specializations/warp_reduce_shfl.cuh(343): error: namespace "cub" has no member "Sum"

specializations/warp_reduce_shfl.cuh(343): error: expected a ")"

specializations/warp_reduce_shfl.cuh(371): error: SwizzleScanOp is not a template

specializations/warp_reduce_shfl.cuh(371): error: identifier "ReduceBySegmentOp" is undefined

specializations/warp_reduce_shfl.cuh(371): error: namespace "cub" has no member "Sum"

specializations/warp_reduce_shfl.cuh(371): error: expected a ")"

specializations/warp_reduce_shfl.cuh(369): error: invalid redeclaration of member function template "cub::KeyValuePair<KeyT, ValueT> cub::WarpReduceShfl<T, LOGICAL_WARP_THREADS, PTX_ARCH>::ReduceStep(cub::KeyValuePair<KeyT, ValueT>, <error-type>)"
(341): here

specializations/warp_reduce_smem.cuh(1): warning: extra text after expected end of preprocessing directive

specializations/warp_reduce_smem.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_5E7" and its replacement text

../../thread/thread_load.cuh(1): warning: extra text after expected end of preprocessing directive

../../thread/thread_store.cuh(1): warning: extra text after expected end of preprocessing directive

../thread/thread_operators.cuh(1): warning: extra text after expected end of preprocessing directive

../../thread/thread_reduce.cuh(1): warning: extra text after expected end of preprocessing directive

specializations/block_reduce_raking_commutative_only.cuh(1): warning: extra text after expected end of preprocessing directive

specializations/block_reduce_raking_commutative_only.cuh(2): warning: white space is required between the macro name "_JITIFY_INCLUDE_GUARD_E4" and its replacement text

specializations/block_reduce_raking_commutative_only.cuh(98): warning: declaration does not declare anything

specializations/block_reduce_warp_reductions.cuh(1): warning: extra text after expected end of preprocessing directive

../util_ptx.cuh(1): warning: extra text after expected end of preprocessing directive

cupy/cub/cub/block/block_reduce.cuh(236): error: BlockReduceWarpReductions is not a template

22 errors detected in the compilation of "/tmp/tmp_91sfpbc/781dd8cb9c775466a5d7241f50c2d8899f818631.cubin.cu".

---------------------------------------------------
Traceback (most recent call last):
  File "/home/maehashi/Development/cupy/cupy/cuda/compiler.py", line 250, in _jitify_prep
    name, options, headers, include_names = jitify(
  File "cupy/cuda/jitify.pyx", line 59, in cupy.cuda.jitify.jitify
    cpdef jitify(str code, tuple opt, dict cached_sources=None):
  File "cupy/cuda/jitify.pyx", line 92, in cupy.cuda.jitify.jitify
    load_program(cuda_source, headers, nullptr, &include_paths,
RuntimeError: Runtime compilation failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/maehashi/Development/cupy/test.py", line 22, in <module>
    ker = mod1.get_function('test_sum')
  File "cupy/_core/raw.pyx", line 485, in cupy._core.raw.RawModule.get_function
    func = ker.kernel  # noqa
  File "cupy/_core/raw.pyx", line 96, in cupy._core.raw.RawKernel.kernel.__get__
    return self._kernel()
  File "cupy/_core/raw.pyx", line 113, in cupy._core.raw.RawKernel._kernel
    mod = _get_raw_module(
  File "cupy/_util.pyx", line 67, in cupy._util.memoize.decorator.ret
    result = f(*args, **kwargs)
  File "cupy/_core/raw.pyx", line 547, in cupy._core.raw._get_raw_module
    mod = cupy._core.core.compile_with_cache(
  File "cupy/_core/core.pyx", line 2062, in cupy._core.core.compile_with_cache
    cpdef function.Module compile_with_cache(
  File "cupy/_core/core.pyx", line 2122, in cupy._core.core.compile_with_cache
    return cuda.compiler._compile_module_with_cache(
  File "/home/maehashi/Development/cupy/cupy/cuda/compiler.py", line 487, in _compile_module_with_cache
    return _compile_with_cache_cuda(
  File "/home/maehashi/Development/cupy/cupy/cuda/compiler.py", line 565, in _compile_with_cache_cuda
    ptx, mapping = compile_using_nvrtc(
  File "/home/maehashi/Development/cupy/cupy/cuda/compiler.py", line 316, in compile_using_nvrtc
    return _compile(source, options, cu_path,
  File "/home/maehashi/Development/cupy/cupy/cuda/compiler.py", line 285, in _compile
    options, headers, include_names = _jitify_prep(
  File "/home/maehashi/Development/cupy/cupy/cuda/compiler.py", line 258, in _jitify_prep
    raise JitifyException(str(cex))
cupy.cuda.compiler.JitifyException: Runtime compilation failed

Installation

Source (pip install cupy)

Environment

OS                           : Linux-4.15.0-153-generic-x86_64-with-glibc2.27
Python Version               : 3.9.5
CuPy Version                 : 10.0.0rc1
CuPy Platform                : NVIDIA CUDA
NumPy Version                : 1.21.2
SciPy Version                : 1.7.1
Cython Build Version         : 0.29.24
Cython Runtime Version       : 0.29.24
CUDA Root                    : /usr/local/cuda-11.4.1
nvcc PATH                    : ccache nvcc
CUDA Build Version           : 11040
CUDA Driver Version          : 11050
CUDA Runtime Version         : 11040
cuBLAS Version               : (available)
cuFFT Version                : 10501
cuRAND Version               : 10205
cuSOLVER Version             : (11, 2, 0)
cuSPARSE Version             : (available)
NVRTC Version                : (11, 4)
Thrust Version               : 101201
CUB Build Version            : 101201
Jitify Build Version         : 60e9e72
cuDNN Build Version          : None
cuDNN Version                : None
NCCL Build Version           : None
NCCL Runtime Version         : None
cuTENSOR Version             : None
cuSPARSELt Build Version     : None
Device 0 Name                : Tesla P100-PCIE-16GB
Device 0 Compute Capability  : 60
Device 0 PCI Bus ID          : 0000:1A:00.0
Device 1 Name                : Tesla P100-PCIE-16GB
Device 1 Compute Capability  : 60
Device 1 PCI Bus ID          : 0000:3D:00.0
Device 2 Name                : Tesla P100-PCIE-16GB
Device 2 Compute Capability  : 60
Device 2 PCI Bus ID          : 0000:B1:00.0

Additional Information

cc/ @leofang

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 1
  • Comments: 8 (8 by maintainers)

Top GitHub Comments

5 reactions
kmaehashi commented, Nov 20, 2021

It seems that cuSPARSELt 0.2.0 automatically calls std::locale::global when it is loaded, switching to a locale that inserts commas as thousands separators when formatting numbers. That breaks Jitify’s include guard generation code, as the following reproduction shows.

#include <string>
#include <sstream>
#include <iostream>
#include <iomanip>
#include <dlfcn.h>


int main() {
    // Loading the library alone is enough to trigger the problem (RTLD_NOW == 2 on Linux).
    dlopen("/home/maehashi/local/cuda/cusparselt/libcusparse_lt/lib64/libcusparseLt.so.0.2.0.1", RTLD_NOW);

    // quote from Jitify code
    int hash = 999999999;
    std::stringstream ss;
    ss << std::uppercase << std::hex << std::setw(8) << std::setfill('0')
           << hash;
    std::string include_guard_name = "_JITIFY_INCLUDE_GUARD_" + ss.str() + "\n";
    std::cout << include_guard_name << std::endl;
}

With dlopen:

% g++ test.cpp -ldl ; ./a.out
_JITIFY_INCLUDE_GUARD_3B,9AC,9FF

Without dlopen (expected):

% g++ test.cpp -ldl ; ./a.out
_JITIFY_INCLUDE_GUARD_3B9AC9FF

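Because a macro name ends at the first comma, the compiler ignores the ,9AC,9FF part (hence the "extra text after expected end of preprocessing directive" and "white space is required between the macro name ... and its replacement text" warnings in the log above). The include guards therefore do not work as expected, which causes the compilation failure.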

2 reactions
kmaehashi commented, Nov 21, 2021

Yes, confirmed that adding ss.imbue(std::locale("C")); in jitify.hpp works around the problem.
