Trouble building with cuda_ext

I get the following error when building with --cuda_ext on a GTX 1060 with PyTorch 1.0 and gcc 4.9.4 (Ubuntu 4.9.4-2ubuntu1):

torch.__version__  =  1.0.0
running install
running bdist_egg
running egg_info
writing apex.egg-info/PKG-INFO
writing dependency_links to apex.egg-info/dependency_links.txt
writing top-level names to apex.egg-info/top_level.txt
reading manifest file 'apex.egg-info/SOURCES.txt'
writing manifest file 'apex.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
building 'syncbn' extension
gcc -pthread -B /home/chang/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/chang/anaconda3/lib/python3.7/site-packages/torch/lib/include -I/home/chang/anaconda3/lib/python3.7/site-packages/torch/lib/include/torch/csrc/api/include -I/home/chang/anaconda3/lib/python3.7/site-packages/torch/lib/include/TH -I/home/chang/anaconda3/lib/python3.7/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/home/chang/anaconda3/include/python3.7m -c csrc/syncbn.cpp -o build/temp.linux-x86_64-3.7/csrc/syncbn.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=syncbn -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
/usr/local/cuda/bin/nvcc -I/home/chang/anaconda3/lib/python3.7/site-packages/torch/lib/include -I/home/chang/anaconda3/lib/python3.7/site-packages/torch/lib/include/torch/csrc/api/include -I/home/chang/anaconda3/lib/python3.7/site-packages/torch/lib/include/TH -I/home/chang/anaconda3/lib/python3.7/site-packages/torch/lib/include/THC -I/usr/local/cuda/include -I/home/chang/anaconda3/include/python3.7m -c csrc/welford.cu -o build/temp.linux-x86_64-3.7/csrc/welford.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --compiler-options '-fPIC' -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=syncbn -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
csrc/welford.cu(82): error: identifier "__shfl_down_sync" is undefined
          detected during:
            instantiation of "void welford_reduce_mean_m2n(T *, int *, T &, T &, int &, int, int) [with T=at::acc_type<double, true>]" 
(184): here
            instantiation of "void welford_kernel<scalar_t,accscalar_t,outscalar_t>(const scalar_t *, outscalar_t *, outscalar_t *, outscalar_t *, int, int, int) [with scalar_t=double, accscalar_t=at::acc_type<double, true>, outscalar_t=at::acc_type<double, true>]" 
(364): here

csrc/welford.cu(49): error: identifier "__shfl_down_sync" is undefined
          detected during:
            instantiation of "T warp_reduce_sum(T) [with T=at::acc_type<double, true>]" 
(60): here
            instantiation of "T reduce_block(T *, T) [with T=at::acc_type<double, true>]" 
(268): here
            instantiation of "void reduce_bn_kernel(const scalar_t *, const scalar_t *, const accscalar_t *, const accscalar_t *, accscalar_t *, accscalar_t *, layerscalar_t *, layerscalar_t *, int, int, int, float) [with scalar_t=double, accscalar_t=at::acc_type<double, true>, layerscalar_t=at::acc_type<double, true>]" 
(460): here

csrc/welford.cu(49): error: identifier "__shfl_down_sync" is undefined
          detected during:
            instantiation of "T warp_reduce_sum(T) [with T=at::acc_type<float, true>]" 
(60): here
            instantiation of "T reduce_block(T *, T) [with T=at::acc_type<float, true>]" 
(268): here
            instantiation of "void reduce_bn_kernel(const scalar_t *, const scalar_t *, const accscalar_t *, const accscalar_t *, accscalar_t *, accscalar_t *, layerscalar_t *, layerscalar_t *, int, int, int, float) [with scalar_t=float, accscalar_t=at::acc_type<float, true>, layerscalar_t=at::acc_type<float, true>]" 
(460): here

3 errors detected in the compilation of "/tmp/tmpxft_00002f5a_00000000-7_welford.cpp1.ii".
error: command '/usr/local/cuda/bin/nvcc' failed with exit status 2

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

1 reaction
zimenglan-sysu-512 commented, Apr 19, 2019

Hi @mcarilli, how can I use CUDA 8-specific workarounds for the shfl ops?

0 reactions
mcarilli commented, Jun 19, 2019

Unfortunately, we don’t support CUDA 8. Even if you fixed the shuffles, any number of other things might break, and I can’t predict what they would be. Do you have a Volta GPU you can run on? If so, it’s pretty easy to try CUDA 9+.
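
The `__shfl_down_sync` intrinsic named in the error belongs to the `*_sync` warp shuffle family that first shipped with CUDA 9.0, so an older nvcc cannot compile welford.cu. As a minimal sketch (not part of apex), the check below parses the output of `nvcc --version` before a build is attempted. The function names are hypothetical, and the sketch assumes that nvcc prints a `release X.Y` string and that the binary is either on PATH or passed in explicitly.

```python
import re
import shutil
import subprocess

# The *_sync warp shuffle intrinsics (e.g. __shfl_down_sync) first shipped in CUDA 9.0.
MIN_CUDA = (9, 0)


def parse_nvcc_release(version_text):
    """Return (major, minor) parsed from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def supports_sync_shuffles(release):
    """True if the toolkit release is new enough to provide __shfl_down_sync."""
    return release is not None and release >= MIN_CUDA


def check_local_nvcc(nvcc="nvcc"):
    """Run nvcc and report whether it is CUDA 9.0+; None if nvcc cannot be found."""
    if shutil.which(nvcc) is None:
        return None
    out = subprocess.run([nvcc, "--version"], capture_output=True, text=True).stdout
    return supports_sync_shuffles(parse_nvcc_release(out))


if __name__ == "__main__":
    print(check_local_nvcc())
```

The build log above invokes /usr/local/cuda/bin/nvcc directly, so calling `check_local_nvcc("/usr/local/cuda/bin/nvcc")` would test the same binary the build uses rather than whichever nvcc happens to be first on PATH.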
