Segfault when building kernel
As requested by @stas00, I’m opening an issue here for a problem that I’m experiencing with HuggingFace’s Transformers library. When running the following script: https://github.com/huggingface/transformers/blob/master/examples/research_projects/wav2vec2/README.md#pretraining-wav2vec2
I’m getting a segfault error when building the kernels:
[1/3] /usr/bin/nvcc --generate-dependencies-with-compile --dependency-output custom_cuda_kernel.cuda.o.d -DTORCH_EXTENSION_NAME=cpu_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/patrick/anaconda3/envs/hugging_face/lib/python3.9/site-packages/deepspeed/ops/csrc/includes -I/usr/include -isystem /home/patrick/anaconda3/envs/hugging_face/lib/python3.9/site-packages/torch/include -isystem /home/patrick/anaconda3/envs/hugging_face/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/patrick/anaconda3/envs/hugging_face/lib/python3.9/site-packages/torch/include/TH -isystem /home/patrick/anaconda3/envs/hugging_face/lib/python3.9/site-packages/torch/include/THC -isystem /home/patrick/anaconda3/envs/hugging_face/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_75,code=compute_75 -c /home/patrick/anaconda3/envs/hugging_face/lib/python3.9/site-packages/deepspeed/ops/csrc/adam/custom_cuda_kernel.cu -o custom_cuda_kernel.cuda.o
FAILED: custom_cuda_kernel.cuda.o
/usr/bin/nvcc --generate-dependencies-with-compile --dependency-output custom_cuda_kernel.cuda.o.d -DTORCH_EXTENSION_NAME=cpu_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/patrick/anaconda3/envs/hugging_face/lib/python3.9/site-packages/deepspeed/ops/csrc/includes -I/usr/include -isystem /home/patrick/anaconda3/envs/hugging_face/lib/python3.9/site-packages/torch/include -isystem /home/patrick/anaconda3/envs/hugging_face/lib/python3.9/site-packages/torch/include/torch/csrc/api/include -isystem /home/patrick/anaconda3/envs/hugging_face/lib/python3.9/site-packages/torch/include/TH -isystem /home/patrick/anaconda3/envs/hugging_face/lib/python3.9/site-packages/torch/include/THC -isystem /home/patrick/anaconda3/envs/hugging_face/include/python3.9 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=compute_75 -gencode=arch=compute_75,code=sm_75 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_75,code=compute_75 -c /home/patrick/anaconda3/envs/hugging_face/lib/python3.9/site-packages/deepspeed/ops/csrc/adam/custom_cuda_kernel.cu -o custom_cuda_kernel.cuda.o
/usr/include/c++/10/chrono: In substitution of ‘template<class _Rep, class _Period> template<class _Period2> using __is_harmonic = std::__bool_constant<(std::ratio<((_Period2::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)) * (_Period::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den))), ((_Period2::den / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::den, _Period::den)) * (_Period::num / std::chrono::duration<_Rep, _Period>::_S_gcd(_Period2::num, _Period::num)))>::den == 1)> [with _Period2 = _Period2; _Rep = _Rep; _Period = _Period]’:
/usr/include/c++/10/chrono:473:154: required from here
/usr/include/c++/10/chrono:428:27: internal compiler error: Segmentation fault
428 | _S_gcd(intmax_t __m, intmax_t __n) noexcept
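The last lines of the log place the crash inside a standard-library header of the host compiler rather than in DeepSpeed's own sources. A minimal Python sketch for pulling that location out of a saved build log follows; the function name `find_internal_compiler_errors`, the `CompilerCrash` type, and the regular expression are illustrative assumptions, not part of any tool mentioned in this issue.

```python
import re
from typing import List, NamedTuple


class CompilerCrash(NamedTuple):
    path: str
    line: int
    column: int
    message: str


# Matches GCC-style diagnostics of the form
# "<path>:<line>:<column>: internal compiler error: <message>"
_ICE_PATTERN = re.compile(
    r"^(?P<path>[^\s:]+):(?P<line>\d+):(?P<column>\d+): "
    r"internal compiler error: (?P<message>.+)$",
    re.MULTILINE,
)


def find_internal_compiler_errors(log_text: str) -> List[CompilerCrash]:
    """Return every 'internal compiler error' diagnostic found in a build log."""
    return [
        CompilerCrash(m["path"], int(m["line"]), int(m["column"]), m["message"].strip())
        for m in _ICE_PATTERN.finditer(log_text)
    ]


# Two lines copied from the log above.
sample_log = (
    "/usr/include/c++/10/chrono:473:154: required from here\n"
    "/usr/include/c++/10/chrono:428:27: internal compiler error: Segmentation fault\n"
)
for crash in find_internal_compiler_errors(sample_log):
    print(f"{crash.path}:{crash.line}:{crash.column} -> {crash.message}")
```

The reported path, `/usr/include/c++/10/chrono`, suggests the headers of a GCC 10 installation, so the host compiler version is worth noting when reporting a crash like this.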
My environment is:
- `transformers` version: 4.10.0.dev0
- Platform: Linux-5.11.0-25-generic-x86_64-with-glibc2.33
- Python version: 3.9.1
- PyTorch version (GPU?): 1.9.0.dev20210217 (True)
- Using GPU in script?: yes
- Using distributed or parallel set-up in script?: yes
- Deepspeed: 0.4.4
- CUDA Version: 11.2
- GPU: 4 x TITAN RTX
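The list above does not include the host compiler, which is the component that crashed in the log. Below is a minimal sketch for collecting toolchain versions to attach to a report like this one. `version_output` and `summarize` are hypothetical helpers, and the sketch assumes `gcc` and `nvcc` are on `PATH` and print their version when invoked with `--version`.

```python
import shutil
import subprocess
from typing import Optional


def version_output(tool: str) -> Optional[str]:
    """Return the text printed by `<tool> --version`, or None if unavailable."""
    executable = shutil.which(tool)
    if executable is None:
        return None
    try:
        result = subprocess.run(
            [executable, "--version"],
            capture_output=True,
            text=True,
            timeout=30,
            check=False,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    return result.stdout.strip() or None


def summarize(output: Optional[str], keyword: str = "") -> str:
    """Pick the first non-empty line containing `keyword` (default: first line)."""
    if not output:
        return "not found"
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if not lines:
        return "not found"
    for line in lines:
        if keyword in line:
            return line
    return lines[0]


print("gcc :", summarize(version_output("gcc")))
# nvcc prints several lines; the one containing "release" carries the version.
print("nvcc:", summarize(version_output("nvcc"), keyword="release"))
```

If either tool is missing, the sketch prints `not found` instead of raising, so it can be run on any machine before filing a report.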
To reproduce:
- Clone the transformers repo:
git clone https://github.com/huggingface/transformers
- Install all packages
cd transformers && pip install -e ".[dev]"
- Go inside the wav2vec2 research folder
cd examples/research_projects/wav2vec2
and run the following command:
PYTHONPATH=../../../src deepspeed --num_gpus 4 run_pretrain.py \
--output_dir="./wav2vec2-base-libri-100h" \
--num_train_epochs="3" \
--per_device_train_batch_size="32" \
--per_device_eval_batch_size="32" \
--gradient_accumulation_steps="2" \
--save_total_limit="3" \
--save_steps="500" \
--logging_steps="10" \
--learning_rate="5e-4" \
--weight_decay="0.01" \
--warmup_steps="3000" \
--model_name_or_path="patrickvonplaten/wav2vec2-base-libri-100h" \
--dataset_name="librispeech_asr" \
--dataset_config_name="clean" \
--train_split_name="train.100" \
--preprocessing_num_workers="4" \
--max_duration_in_seconds="10.0" \
--group_by_length \
--verbose_logging \
--fp16 \
--deepspeed ds_config_wav2vec2_zero2.json
Issue Analytics
- State:
- Created 2 years ago
- Comments:10 (7 by maintainers)
Top GitHub Comments
cc: @RezaYazdaniAminabadi
FWIW, I am not able to reproduce this on my machine. It works just fine for me, but on py38.
@patrickvonplaten, I didn’t notice this at first, but I see in your environment:
PyTorch version (GPU?): 1.9.0.dev20210217 (True)
Any chance you could update to an official pt-1.9.0 and re-test? Yours is a nightly build and about 2 weeks before 1.9.0 was released. Is it possible there was some issue in it? Just to ensure we are testing the same things.
Thank you for identifying the source of the problem, Reza!
Patrick, it appears that you got hit by being-on-the-cutting-edge software. I’m on gcc 9.3 still and it doesn’t have this problem.
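The comment above contrasts gcc 9.3, where the build works, with a newer compiler on the reporter's machine. The check below is a minimal sketch that compares a host compiler version against the one version reported as working in this thread. The function names are hypothetical, the threshold reflects only that single observation, and it is not an official compatibility rule for CUDA or DeepSpeed.

```python
import re
from typing import Optional, Tuple

# Assumption: gcc 9.3 is the only version this thread reports as working.
KNOWN_GOOD_GCC = (9, 3)


def parse_gcc_version(version_line: str) -> Optional[Tuple[int, ...]]:
    """Extract the trailing dotted version from the first line of `gcc --version`."""
    match = re.search(r"(\d+(?:\.\d+)+)\s*$", version_line)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_newer_than_known_good(version: Tuple[int, ...]) -> bool:
    """True when `version` is newer than the single known-good data point."""
    return version[: len(KNOWN_GOOD_GCC)] > KNOWN_GOOD_GCC


# Hypothetical first line of `gcc --version`; the reporter's exact compiler
# version is not stated in the thread.
example_line = "gcc (GCC) 10.3.0"
version = parse_gcc_version(example_line)
if version is not None and is_newer_than_known_good(version):
    print(f"gcc {'.'.join(map(str, version))} is newer than the known-good 9.3")
```

A positive result only means the compiler is untested relative to this thread, not that it is known to fail.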