Error in building Transformer kernel

See original GitHub issue

I am using the deepspeed/deepspeed:latest container (I also tried installing DeepSpeed with DS_BUILD_OPS=1 pip install deepspeed, but got the same error) and trying to use the Transformer kernel provided by DeepSpeed as follows:

from deepspeed import DeepSpeedTransformerLayer, DeepSpeedTransformerConfig

if __name__ == "__main__":
    transformer_config = DeepSpeedTransformerConfig(
        batch_size=40,
        hidden_size=768,
        heads=768 // 64,
        intermediate_size=768 * 4,
        attn_dropout_ratio=0.0,
        hidden_dropout_ratio=0.0,
        num_hidden_layers=4,
        initializer_range=0.02,
        fp16=True,
        pre_layer_norm=True,
        stochastic_mode=True,
    )
    layer = DeepSpeedTransformerLayer(config=transformer_config)

But the layer fails to initialize, with the following error:

DeepSpeed Transformer config is  {'layer_id': 0, 'batch_size': 40, 'hidden_size': 768, 'intermediate_size': 3072, 'heads': 12, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 4, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': True, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': True, 'huggingface': False}
Using /root/.cache/torch_extensions as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /root/.cache/torch_extensions/stochastic_transformer/build.ninja...
Building extension module stochastic_transformer...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/8] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/cublas_wrappers.cu -o cublas_wrappers.cuda.o
[2/8] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu -o dropout_kernels.cuda.o
FAILED: dropout_kernels.cuda.o
/usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu -o dropout_kernels.cuda.o
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu(102): error: no operator "*" matches these operands
            operand types are: __half2 * const __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu(103): error: no operator "*" matches these operands
            operand types are: __half2 * const __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu(216): error: no operator "*" matches these operands
            operand types are: __half2 * const __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu(217): error: no operator "*" matches these operands
            operand types are: __half2 * const __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu(335): error: no operator "*" matches these operands
            operand types are: __half2 * const __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu(336): error: no operator "*" matches these operands
            operand types are: __half2 * const __half2

6 errors detected in the compilation of "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu".
[3/8] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu -o normalize_kernels.cuda.o
FAILED: normalize_kernels.cuda.o
/usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu -o normalize_kernels.cuda.o
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(880): error: no operator "*=" matches these operands
            operand types are: __half2 *= __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(883): error: no operator "-" matches these operands
            operand types are: const __half2 - const __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(885): error: ambiguous "?" operation: second operand of type "<error-type>" can be converted to third operand type "const __half2", and vice versa

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(890): error: no operator "*=" matches these operands
            operand types are: __half2 *= __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(892): error: no operator "-" matches these operands
            operand types are: const __half2 - const __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(893): error: ambiguous "?" operation: second operand of type "<error-type>" can be converted to third operand type "const __half2", and vice versa

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(901): error: no operator "*" matches these operands
            operand types are: __half2 * __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(901): error: identifier "h2sqrt" is undefined

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(905): error: identifier "h2rsqrt" is undefined

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(927): error: no operator "-" matches these operands
            operand types are: - __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1189): error: no operator "*=" matches these operands
            operand types are: __half2 *= __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1194): error: no operator "*=" matches these operands
            operand types are: __half2 *= __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1205): error: no operator "-" matches these operands
            operand types are: const __half2 - __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1206): error: no operator "*" matches these operands
            operand types are: __half2 * __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1210): error: identifier "h2rsqrt" is undefined

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1232): error: no operator "-" matches these operands
            operand types are: - __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1232): error: identifier "h2rsqrt" is undefined

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1621): error: no operator "*=" matches these operands
            operand types are: __half2 *= __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1624): error: no operator "-" matches these operands
            operand types are: const __half2 - const __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1626): error: ambiguous "?" operation: second operand of type "<error-type>" can be converted to third operand type "const __half2", and vice versa

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1631): error: no operator "*=" matches these operands
            operand types are: __half2 *= __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1633): error: no operator "-" matches these operands
            operand types are: const __half2 - const __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1634): error: ambiguous "?" operation: second operand of type "<error-type>" can be converted to third operand type "const __half2", and vice versa

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1642): error: no operator "*" matches these operands
            operand types are: __half2 * __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1642): error: identifier "h2sqrt" is undefined

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1646): error: identifier "h2rsqrt" is undefined

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1668): error: no operator "-" matches these operands
            operand types are: - __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1703): error: no operator "+" matches these operands
            operand types are: __half2 + const __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1710): error: no operator "+" matches these operands
            operand types are: __half2 + const __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1940): error: no operator "*=" matches these operands
            operand types are: __half2 *= __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1946): error: no operator "*=" matches these operands
            operand types are: __half2 *= __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1959): error: no operator "-" matches these operands
            operand types are: __half2 - __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1960): error: no operator "*" matches these operands
            operand types are: __half2 * __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1964): error: identifier "h2rsqrt" is undefined

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1986): error: no operator "-" matches these operands
            operand types are: - __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1986): error: identifier "h2rsqrt" is undefined

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(2021): error: no operator "+" matches these operands
            operand types are: __half2 + const __half2

/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(2027): error: no operator "+" matches these operands
            operand types are: __half2 + const __half2

38 errors detected in the compilation of "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu".
[4/8] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/general_kernels.cu -o general_kernels.cuda.o
[5/8] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/transform_kernels.cu -o transform_kernels.cuda.o
[6/8] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/gelu_kernels.cu -o gelu_kernels.cuda.o
[7/8] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/softmax_kernels.cu -o softmax_kernels.cuda.o
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1549, in _run_ninja_build
    subprocess.run(
  File "/opt/conda/lib/python3.8/subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "experimentation.py", line 17, in <module>
    layer = DeepSpeedTransformerLayer(config=transformer_config)
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/transformer/transformer.py", line 543, in __init__
    stochastic_transformer_cuda_module = StochasticTransformerBuilder().load()
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 180, in load
    return self.jit_load(verbose)
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 208, in jit_load
    op_module = load(
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 999, in load
    return _jit_compile(
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1204, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1308, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1565, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'stochastic_transformer'
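
One likely reading of the log above: the nvcc commands target compute_52 alongside newer architectures, and CUDA's half-precision arithmetic operators and intrinsics (such as h2sqrt and h2rsqrt) are only available for compute capability 5.3 and higher. The sketch below is an illustration, not part of DeepSpeed; the helper names gencode_archs and archs_without_half_math are hypothetical. It scans an nvcc command line for -gencode targets that fall below that threshold.

```python
import re

# Assumption for this sketch: __half2 arithmetic needs compute capability 5.3+.
MIN_HALF_ARCH = 53


def gencode_archs(nvcc_command):
    """Return the sorted, de-duplicated compute_XX numbers in an nvcc command."""
    return sorted({int(m) for m in re.findall(r"arch=compute_(\d+)", nvcc_command)})


def archs_without_half_math(nvcc_command, minimum=MIN_HALF_ARCH):
    """Return the targeted architectures that are below the given minimum."""
    return [arch for arch in gencode_archs(nvcc_command) if arch < minimum]


# Shortened version of the flags seen in the failing build.
cmd = (
    "nvcc -gencode=arch=compute_60,code=sm_60 "
    "-gencode=arch=compute_52,code=sm_52 "
    "-gencode=arch=compute_80,code=sm_80 "
    "-gencode=arch=compute_80,code=compute_80"
)
print(archs_without_half_math(cmd))  # [52]
```

If the list is non-empty, narrowing the architecture list to the GPUs actually in use is a reasonable thing to try.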

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 28 (13 by maintainers)

Top GitHub Comments

9 reactions
garvct commented, Jun 10, 2021

root@x8a100-0000:/workspace# env | grep -i arch
TORCH_CUDA_ARCH_LIST=5.2 6.0 6.1 7.0 7.5 8.0 8.6+PTX

export TORCH_CUDA_ARCH_LIST=7.0
DS_BUILD_OPS=1 pip3 install deepspeed

Worked, thank you.
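
For the JIT build path shown in the question, the same idea can be applied from Python by setting TORCH_CUDA_ARCH_LIST before the layer is constructed. This is a minimal sketch under assumptions: the helper names are hypothetical, and using the detected device capability (rather than the 7.0 chosen in the comment above) is an assumption, not something the thread confirms.

```python
import os


def arch_list_for(capability):
    """Format a (major, minor) compute capability as a TORCH_CUDA_ARCH_LIST value."""
    major, minor = capability
    return f"{major}.{minor}"


def pin_arch_list(capability, env=os.environ):
    """Restrict extension builds to one architecture by setting the env variable."""
    env["TORCH_CUDA_ARCH_LIST"] = arch_list_for(capability)
    return env["TORCH_CUDA_ARCH_LIST"]


def detect_capability(device=0):
    """Query the GPU's compute capability; requires PyTorch and a CUDA device."""
    import torch  # imported lazily so the helpers above work without PyTorch

    return torch.cuda.get_device_capability(device)
```

A call such as pin_arch_list(detect_capability()) would need to run before DeepSpeedTransformerLayer is created, since that is when the extension build is triggered.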

2 reactions
qysnn commented, Mar 8, 2022

unset TORCH_CUDA_ARCH_LIST fixed the problem for me.
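
The in-process equivalent of unsetting the variable is sketched below, assuming the script itself triggers the build. The function name clear_arch_list is hypothetical. With the variable absent, PyTorch's extension builder is expected to fall back to the architectures of the visible GPUs.

```python
import os


def clear_arch_list(env=os.environ):
    """Remove TORCH_CUDA_ARCH_LIST if present; return its previous value or None."""
    return env.pop("TORCH_CUDA_ARCH_LIST", None)


# Call this before constructing DeepSpeedTransformerLayer, because the
# extension is compiled when the layer is first created.
previous = clear_arch_list()
```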
