Build failure from source with USE_FP16=ON, CUDA 10.2, and the Volta architecture
🐛 Bug
Hello, I’m trying to build DGL with fp16 support using the master branch. I used the following cmake flags:
cmake -DUSE_CUDA=ON -DCUDA_ARCH_NAME=Manual -DCUDA_ARCH_BIN="70 75" -DUSE_AVX=OFF -DBUILD_TORCH=ON -DUSE_FP16=ON ..
When I run make, it succeeds in compiling tensoradapter for torch and all of the CPU kernels, then starts printing a bunch of these errors and fails:
/ccs/home/skrsna/dgl/src/array/cuda/./sddmm.cuh(113): error: more than one conversion function from "const half" to a built-in type applies:
function "__half::operator float() const"
function "__half::operator short() const"
function "__half::operator unsigned short() const"
function "__half::operator int() const"
function "__half::operator unsigned int() const"
function "__half::operator long long() const"
function "__half::operator unsigned long long() const"
function "__half::operator __nv_bool() const"
detected during:
instantiation of "void dgl::aten::cuda::SDDMMCooTreeReduceKernel(const DType *, const DType *, DType *, const Idx *, const Idx *, const Idx *, int64_t, int64_t, int64_t, int64_t, const int64_t *, const int64_t *, int64_t, int64_t, int64_t) [with Idx=int32_t, DType=half, UseBcast=false, UseIdx=false, LhsTarget=1, RhsTarget=1]"
(235): here
instantiation of "void dgl::aten::cuda::SDDMMCoo<Idx,DType,Op,LhsTarget,RhsTarget>(const dgl::BcastOff &, const dgl::aten::COOMatrix &, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray) [with Idx=int32_t, DType=half, Op=dgl::aten::cuda::binary::Add<half>, LhsTarget=1, RhsTarget=1]"
/ccs/home/skrsna/dgl/src/array/cuda/sddmm.cu(106): here
instantiation of "void dgl::aten::SDDMMCoo<XPU,IdType,bits>(const std::__cxx11::string &, const dgl::BcastOff &, const dgl::aten::COOMatrix &, dgl::runtime::NDArray, dgl::runtime::NDArray, dgl::runtime::NDArray, int, int) [with XPU=2, IdType=int32_t, bits=16]"
/ccs/home/skrsna/dgl/src/array/cuda/sddmm.cu(140): here
Error limit reached.
100 errors detected in the compilation of "/tmp/tmpxft_000163b3_00000000-8_sddmm.compute_50.cpp1.ii".
Compilation terminated.
CMake Error at dgl_generated_sddmm.cu.o.cmake:276 (message):
Error generating file
/ccs/home/skrsna/dgl/build/CMakeFiles/dgl.dir/src/array/cuda/./dgl_generated_sddmm.cu.o
make[2]: *** [CMakeFiles/dgl.dir/build.make:4669: CMakeFiles/dgl.dir/src/array/cuda/dgl_generated_sddmm.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:166: CMakeFiles/dgl.dir/all] Error 2
make: *** [Makefile:149: all] Error 2
I tried this with gcc versions 7.4.0, 8.1.0, and 8.1.1, to no avail. @romerojosh also reported similar compilation errors on a DGX V100 machine (correct me if I’m wrong). Any help with this? Thanks 🙂
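For reference, a note on where the ambiguity likely comes from: the failing translation unit is sddmm.compute_50.cpp1.ii, and the CUDA flags in the cmake output below contain a -gencode;arch=compute_50,code=compute_50 PTX fallback even though CUDA_ARCH_BIN was set to "70 75". In CUDA 10.x, cuda_fp16.h defines arithmetic operators for __half only when __CUDA_ARCH__ >= 530, so when the same kernel is compiled for compute_50, overload resolution falls back to __half's many implicit conversion operators (to float, short, int, ..., bool), several of which apply at once for a binary operation — exactly the "more than one conversion function" error above. A minimal sketch that should reproduce this outside of DGL (the file and kernel names here are made up, not DGL code):

```cuda
// repro.cu -- standalone sketch of the __half ambiguity; hypothetical, not DGL code.
#include <cuda_fp16.h>

__global__ void add_halves(const half* a, const half* b, half* out, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) {
    // For __CUDA_ARCH__ >= 530 this resolves to __half's operator+.
    // For compute_50 there is no __half operator+, so the compiler tries the
    // built-in operator+ through __half's conversion operators and finds more
    // than one that applies -- the error shown above.
    out[i] = a[i] + b[i];

    // A common workaround is to make the conversion explicit:
    // out[i] = __float2half(__half2float(a[i]) + __half2float(b[i]));
  }
}
```

Compiling with nvcc -arch=sm_70 -c repro.cu should succeed, while nvcc -gencode arch=compute_50,code=compute_50 -c repro.cu should reproduce the ambiguity, so building without the compute_50 fallback (i.e. only for sm_70/sm_75) may be worth trying.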
Expected behavior
Environment
- DGL Version (e.g., 1.0): 0.6 (from master, commit db57809da147c663a8369c554986ca9d0b19f0ea)
- Backend Library & Version (e.g., PyTorch 0.4.1, MXNet/Gluon 1.3): PyTorch 1.7.1
- OS (e.g., Linux): RHEL 7.6
- How you installed DGL (conda, pip, source): source
- Build command you used (if compiling from source): cmake -DUSE_CUDA=ON -DCUDA_ARCH_NAME=Manual -DCUDA_ARCH_BIN="70 75" -DUSE_AVX=OFF -DBUILD_TORCH=ON -DUSE_FP16=ON ..
- Python version: 3.8
- CUDA/cuDNN version (if applicable): 10.2.89 / cuDNN 7.6.5_10.2
- GPU models and configuration (e.g. V100): V100
- Any other relevant information:
Additional context
- cmake output
-- The C compiler identification is GNU 8.1.1
-- The CXX compiler identification is GNU 8.1.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /sw/summit/gcc/8.1.1/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /sw/summit/gcc/8.1.1/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Start configuring project dgl
-- Build with CUDA support
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found CUDA: /sw/summit/cuda/10.2.89 (found version "10.2")
-- Found CUDA_TOOLKIT_ROOT_DIR=/sw/summit/cuda/10.2.89
-- Found CUDA_CUDART_LIBRARY=/sw/summit/cuda/10.2.89/lib64/libcudart.so
-- Found CUDA_CUBLAS_LIBRARY=/ccs/home/skrsna/.conda/envs/builder/lib/libcublas.so
-- Performing Test SUPPORT_CXX14
-- Performing Test SUPPORT_CXX14 - Success
-- Detected CUDA of version 10.2. Use external CUB/Thrust library.
-- Performing Test SUPPORT_CXX11
-- Performing Test SUPPORT_CXX11 - Success
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- Build with OpenMP.
-- Build with fp16 to support mixed precision training
-- -fopenmp -O2 -Wall -fPIC -std=c++11 -DUSE_FP16 -DIDXTYPEWIDTH=64 -DREALTYPEWIDTH=32
-- CUDA flags: -Xcompiler ,-fopenmp,-O2,-Wall,-fPIC,,,-DUSE_FP16,-DIDXTYPEWIDTH=64,-DREALTYPEWIDTH=32;--expt-relaxed-constexpr;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_50,code=compute_50;--expt-extended-lambda;-Wno-deprecated-declarations;-std=c++14
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Looking for clock_gettime in rt
-- Looking for clock_gettime in rt - found
-- Looking for fopen64
-- Looking for fopen64 - not found
-- Looking for C++ include cxxabi.h
-- Looking for C++ include cxxabi.h - found
-- Looking for nanosleep
-- Looking for nanosleep - found
-- Looking for backtrace
-- Looking for backtrace - found
-- backtrace facility detected in default set of libraries
-- Found Backtrace: /usr/include
-- Check if the system is big endian
-- Searching 16 bit integer
-- Looking for sys/types.h
-- Looking for sys/types.h - found
-- Looking for stdint.h
-- Looking for stdint.h - found
-- Looking for stddef.h
-- Looking for stddef.h - found
-- Check size of unsigned short
-- Check size of unsigned short - done
-- Searching 16 bit integer - Using unsigned short
-- Check if the system is big endian - little endian
-- /ccs/home/skrsna/dgl/third_party/dmlc-core/cmake/build_config.h.in -> include/dmlc/build_config.h
-- Looking for execinfo.h
-- Looking for execinfo.h - found
-- Looking for getline
-- Looking for getline - found
-- Configuring done
-- Generating done
-- Build files have been written to: /ccs/home/skrsna/dgl/build

Hi, must I compile DGL from source if I want to use the mixed precision feature, or can I just use pip? I tried pip install dgl-cu101==0.6.0, but it did not work. I also tried to build DGL with cmake -DUSE_CUDA=ON -DUSE_FP16=ON .., but that failed too.
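For what it's worth, here is a quick smoke test for whether an installed build actually supports fp16 (a minimal sketch assuming the PyTorch backend; the graph and feature names are made up, and the exact failure on a non-fp16 build may differ):

```python
import torch
import dgl
import dgl.function as fn

# Tiny 3-node, 2-edge graph on the GPU; purely illustrative.
g = dgl.graph((torch.tensor([0, 1]), torch.tensor([1, 2]))).to('cuda')
g.ndata['h'] = torch.randn(3, 4, device='cuda').half()  # fp16 node features

# This message-passing step dispatches to DGL's 16-bit SpMM kernels; on a
# build without USE_FP16 it should raise an unsupported-dtype error.
g.update_all(fn.copy_u('h', 'm'), fn.sum('m', 'h_new'))
print(g.ndata['h_new'].dtype)  # expected: torch.float16 on an fp16-enabled build
```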
Hi @yzh119, I failed to compile the DGL source files at the same step as @skrsna, although my cmake output is different. Could you please specify the right command I should use? Thanks!