warp-ctc make error ( identifier "__shfl_down" is undefined )

Hi, @SeanNaren

I'm having trouble building warp-ctc. cmake succeeds, but when I then run make,

[ 11%] Building NVCC (Device) object CMakeFiles/warpctc.dir/src/warpctc_generated_reduce.cu.o
/home/jonghu/ds2/warp-ctc/src/reduce.cu(44): error: identifier "__shfl_down" is undefined
          detected during:
            instantiation of "T CTAReduce<NT, T, Rop>::reduce(int, T, CTAReduce<NT, T, Rop>::Storage &, int, Rop) [with NT=128, T=float, Rop=ctc_helper::add<float, float>]"
(76): here
            instantiation of "void reduce_rows<NT,Iop,Rop,T>(Iop, Rop, const T *, T *, int, int) [with NT=128, Iop=ctc_helper::negate<float, float>, Rop=ctc_helper::add<float, float>, T=float]"
(124): here
            instantiation of "void ReduceHelper::impl(Iof, Rof, const T *, T *, int, int, __nv_bool, cudaStream_t) [with T=float, Iof=ctc_helper::negate<float, float>, Rof=ctc_helper::add<float, float>]"
(139): here
            instantiation of "ctcStatus_t reduce(Iof, Rof, const T *, T *, int, int, __nv_bool, cudaStream_t) [with T=float, Iof=ctc_helper::negate<float, float>, Rof=ctc_helper::add<float, float>]"
(149): here

/home/jonghu/ds2/warp-ctc/src/reduce.cu(44): error: identifier "__shfl_down" is undefined
          detected during:
            instantiation of "T CTAReduce<NT, T, Rop>::reduce(int, T, CTAReduce<NT, T, Rop>::Storage &, int, Rop) [with NT=128, T=float, Rop=ctc_helper::maximum<float, float>]"
(76): here
            instantiation of "void reduce_rows<NT,Iop,Rop,T>(Iop, Rop, const T *, T *, int, int) [with NT=128, Iop=ctc_helper::identity<float, float>, Rop=ctc_helper::maximum<float, float>, T=float]"
(124): here
            instantiation of "void ReduceHelper::impl(Iof, Rof, const T *, T *, int, int, __nv_bool, cudaStream_t) [with T=float, Iof=ctc_helper::identity<float, float>, Rof=ctc_helper::maximum<float, float>]"
(139): here
            instantiation of "ctcStatus_t reduce(Iof, Rof, const T *, T *, int, int, __nv_bool, cudaStream_t) [with T=float, Iof=ctc_helper::identity<float, float>, Rof=ctc_helper::maximum<float, float>]"
(157): here

2 errors detected in the compilation of "/tmp/tmpxft_0000636d_00000000-13_reduce.compute_70.cpp1.ii".
CMake Error at warpctc_generated_reduce.cu.o.cmake:279 (message):
  Error generating file
  /home/jonghu/ds2/warp-ctc/build/CMakeFiles/warpctc.dir/src/./warpctc_generated_reduce.cu.o


CMakeFiles/warpctc.dir/build.make:337: recipe for target 'CMakeFiles/warpctc.dir/src/warpctc_generated_reduce.cu.o' failed
make[2]: *** [CMakeFiles/warpctc.dir/src/warpctc_generated_reduce.cu.o] Error 1
CMakeFiles/Makefile2:109: recipe for target 'CMakeFiles/warpctc.dir/all' failed
make[1]: *** [CMakeFiles/warpctc.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2

this error occurs.

I did some research and found that __shfl_down() is deprecated and is no longer available when compiling for newer device architectures ( link ), so it needs to be changed to __shfl_down_sync().

But when I change __shfl_down() to __shfl_down_sync() in warp-ctc/src/reduce.cu,

[ 11%] Building NVCC (Device) object CMakeFiles/warpctc.dir/src/warpctc_generated_reduce.cu.o
/home/jonghu/ds2/warp-ctc/src/reduce.cu(44): error: no instance of overloaded function "__shfl_down_sync" matches the argument list
            argument types are: (float, int)
          detected during:
            instantiation of "T CTAReduce<NT, T, Rop>::reduce(int, T, CTAReduce<NT, T, Rop>::Storage &, int, Rop) [with NT=128, T=float, Rop=ctc_helper::add<float, float>]"
(76): here
            instantiation of "void reduce_rows<NT,Iop,Rop,T>(Iop, Rop, const T *, T *, int, int) [with NT=128, Iop=ctc_helper::negate<float, float>, Rop=ctc_helper::add<float, float>, T=float]"
(124): here
            instantiation of "void ReduceHelper::impl(Iof, Rof, const T *, T *, int, int, __nv_bool, cudaStream_t) [with T=float, Iof=ctc_helper::negate<float, float>, Rof=ctc_helper::add<float, float>]"
(139): here
            instantiation of "ctcStatus_t reduce(Iof, Rof, const T *, T *, int, int, __nv_bool, cudaStream_t) [with T=float, Iof=ctc_helper::negate<float, float>, Rof=ctc_helper::add<float, float>]"
(149): here

/home/jonghu/ds2/warp-ctc/src/reduce.cu(44): error: no instance of overloaded function "__shfl_down_sync" matches the argument list
            argument types are: (float, int)
          detected during:
            instantiation of "T CTAReduce<NT, T, Rop>::reduce(int, T, CTAReduce<NT, T, Rop>::Storage &, int, Rop) [with NT=128, T=float, Rop=ctc_helper::maximum<float, float>]"
(76): here
            instantiation of "void reduce_rows<NT,Iop,Rop,T>(Iop, Rop, const T *, T *, int, int) [with NT=128, Iop=ctc_helper::identity<float, float>, Rop=ctc_helper::maximum<float, float>, T=float]"
(124): here
            instantiation of "void ReduceHelper::impl(Iof, Rof, const T *, T *, int, int, __nv_bool, cudaStream_t) [with T=float, Iof=ctc_helper::identity<float, float>, Rof=ctc_helper::maximum<float, float>]"
(139): here
            instantiation of "ctcStatus_t reduce(Iof, Rof, const T *, T *, int, int, __nv_bool, cudaStream_t) [with T=float, Iof=ctc_helper::identity<float, float>, Rof=ctc_helper::maximum<float, float>]"
(157): here

2 errors detected in the compilation of "/tmp/tmpxft_000063c5_00000000-13_reduce.compute_70.cpp1.ii".
CMake Error at warpctc_generated_reduce.cu.o.cmake:279 (message):
  Error generating file
  /home/jonghu/ds2/warp-ctc/build/CMakeFiles/warpctc.dir/src/./warpctc_generated_reduce.cu.o


CMakeFiles/warpctc.dir/build.make:337: recipe for target 'CMakeFiles/warpctc.dir/src/warpctc_generated_reduce.cu.o' failed
make[2]: *** [CMakeFiles/warpctc.dir/src/warpctc_generated_reduce.cu.o] Error 1
CMakeFiles/Makefile2:109: recipe for target 'CMakeFiles/warpctc.dir/all' failed
make[1]: *** [CMakeFiles/warpctc.dir/all] Error 2
Makefile:129: recipe for target 'all' failed
make: *** [all] Error 2

this error occurs.

My GPU is a GeForce RTX 2080 Ti, and the build fails with CUDA versions 9.0, 9.1, and 10.1. Is there a way to solve this issue?

Sincerely, Jonghu.

Top GitHub Comments

tq09mx5 commented, Mar 8, 2019

src/reduce.cu
Line 44 to: shuff = __shfl_down_sync(0xFFFFFFFF, x, offset);

include/contrib/moderngpu/include/device/intrinsics.cuh
Line 115 to: var = __shfl_up_sync(0xFFFFFFFF, var, delta, width);
Line 125 to: p.x = __shfl_up_sync(0xFFFFFFFF, p.x, delta, width);
Line 126 to: p.y = __shfl_up_sync(0xFFFFFFFF, p.y, delta, width);
Line 143 to: "shfl.up.sync.b32 r0|p, %1, %2, %3, %4;"
Line 158 to: "shfl.up.sync.b32 r0|p, %1, %2, %3, %4;"

Works fine with CUDA 10.1.
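As a rough illustration of the reduce.cu change, here is a minimal, self-contained sketch of a warp-level sum built on __shfl_down_sync. It is not warp-ctc's actual CTAReduce code: the names warp_sum and demo are made up for this example, and it assumes a full warp of 32 active threads so that the 0xFFFFFFFF mask is valid. The extra first argument is the point: renaming __shfl_down to __shfl_down_sync without adding the mask leaves the (float, int) argument list that the compiler rejected in the question.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Sum one float per lane across a warp; lane 0 ends up with the total.
// warp_sum is a hypothetical helper, not a warp-ctc function.
__device__ float warp_sum(float x) {
    for (int offset = warpSize / 2; offset > 0; offset /= 2) {
        // Old form, rejected when compiling for compute_70 and newer:
        //   x += __shfl_down(x, offset);
        // New form: the first argument is the mask of participating lanes.
        x += __shfl_down_sync(0xFFFFFFFF, x, offset);
    }
    return x;
}

__global__ void demo(float* out) {
    float total = warp_sum(1.0f);  // every lane contributes 1.0f
    if (threadIdx.x == 0) {
        *out = total;
    }
}

int main() {
    float* d_out = nullptr;
    float h_out = 0.0f;
    cudaMalloc(&d_out, sizeof(float));
    demo<<<1, 32>>>(d_out);  // launch exactly one full warp
    cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    printf("warp sum = %f\n", h_out);
    return 0;
}
```

With all 32 lanes contributing 1.0f, lane 0 should end up holding 32. If fewer than 32 threads were active, the mask would have to be narrowed to match the participating lanes.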

zhenglilei commented, Oct 15, 2019

The changes listed in tq09mx5's comment above (src/reduce.cu line 44 and the five lines in include/contrib/moderngpu/include/device/intrinsics.cuh) work fine with CUDA 10.1. This was still the correct solution as of Oct. 2019.
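If the same source also has to build with toolkits older than CUDA 9.0, where the _sync intrinsics do not exist, one possible approach is a version guard. The sketch below is an assumption about how this could be done, not something taken from the warp-ctc repository: SHFL_DOWN and warp_sum_compat are hypothetical names, and the full-warp mask again assumes that all 32 lanes are active.

```cuda
#include <cuda.h>  // defines CUDA_VERSION (9000 corresponds to CUDA 9.0)

// SHFL_DOWN is a hypothetical wrapper macro, not part of warp-ctc.
#if defined(CUDA_VERSION) && CUDA_VERSION >= 9000
// CUDA 9.0 and newer: use the synchronized intrinsic with a full-warp mask.
#define SHFL_DOWN(val, offset) __shfl_down_sync(0xFFFFFFFF, (val), (offset))
#else
// Older toolkits: only the legacy intrinsic is available.
#define SHFL_DOWN(val, offset) __shfl_down((val), (offset))
#endif

// Warp-level sum that compiles against either form of the intrinsic.
__device__ float warp_sum_compat(float x) {
    for (int offset = warpSize / 2; offset > 0; offset /= 2) {
        x += SHFL_DOWN(x, offset);
    }
    return x;
}
```

The guard keys on the toolkit version rather than the target architecture because __shfl_down_sync is provided by the CUDA 9.0 and later headers regardless of which device is being compiled for.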