Error in building Transformer kernel
I am using the deepspeed/deepspeed:latest container (I also tried installing DeepSpeed with DS_BUILD_OPS=1 pip install deepspeed, but got the same error) and am trying to use the Transformer kernel provided by DeepSpeed as follows:
from deepspeed import DeepSpeedTransformerLayer, DeepSpeedTransformerConfig

if __name__ == "__main__":
    transformer_config = DeepSpeedTransformerConfig(
        batch_size=40,
        hidden_size=768,
        heads=768 // 64,
        intermediate_size=768 * 4,
        attn_dropout_ratio=0.0,
        hidden_dropout_ratio=0.0,
        num_hidden_layers=4,
        initializer_range=0.02,
        fp16=True,
        pre_layer_norm=True,
        stochastic_mode=True,
    )
    layer = DeepSpeedTransformerLayer(config=transformer_config)
But I can't initialize the layer; building the extension fails with the following error:
DeepSpeed Transformer config is {'layer_id': 0, 'batch_size': 40, 'hidden_size': 768, 'intermediate_size': 3072, 'heads': 12, 'attn_dropout_ratio': 0.0, 'hidden_dropout_ratio': 0.0, 'num_hidden_layers': 4, 'initializer_range': 0.02, 'fp16': True, 'pre_layer_norm': True, 'local_rank': -1, 'seed': -1, 'normalize_invertible': False, 'gelu_checkpoint': False, 'adjust_init_range': True, 'test_gemm': False, 'training': True, 'is_grad_enabled': True, 'attn_dropout_checkpoint': False, 'stochastic_mode': True, 'huggingface': False}
Using /root/.cache/torch_extensions as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /root/.cache/torch_extensions/stochastic_transformer/build.ninja...
Building extension module stochastic_transformer...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/8] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/cublas_wrappers.cu -o cublas_wrappers.cuda.o
[2/8] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu -o dropout_kernels.cuda.o
FAILED: dropout_kernels.cuda.o
/usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu -o dropout_kernels.cuda.o
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu(102): error: no operator "*" matches these operands
operand types are: __half2 * const __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu(103): error: no operator "*" matches these operands
operand types are: __half2 * const __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu(216): error: no operator "*" matches these operands
operand types are: __half2 * const __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu(217): error: no operator "*" matches these operands
operand types are: __half2 * const __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu(335): error: no operator "*" matches these operands
operand types are: __half2 * const __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu(336): error: no operator "*" matches these operands
operand types are: __half2 * const __half2
6 errors detected in the compilation of "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/dropout_kernels.cu".
[3/8] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu -o normalize_kernels.cuda.o
FAILED: normalize_kernels.cuda.o
/usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu -o normalize_kernels.cuda.o
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(880): error: no operator "*=" matches these operands
operand types are: __half2 *= __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(883): error: no operator "-" matches these operands
operand types are: const __half2 - const __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(885): error: ambiguous "?" operation: second operand of type "<error-type>" can be converted to third operand type "const __half2", and vice versa
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(890): error: no operator "*=" matches these operands
operand types are: __half2 *= __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(892): error: no operator "-" matches these operands
operand types are: const __half2 - const __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(893): error: ambiguous "?" operation: second operand of type "<error-type>" can be converted to third operand type "const __half2", and vice versa
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(901): error: no operator "*" matches these operands
operand types are: __half2 * __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(901): error: identifier "h2sqrt" is undefined
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(905): error: identifier "h2rsqrt" is undefined
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(927): error: no operator "-" matches these operands
operand types are: - __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1189): error: no operator "*=" matches these operands
operand types are: __half2 *= __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1194): error: no operator "*=" matches these operands
operand types are: __half2 *= __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1205): error: no operator "-" matches these operands
operand types are: const __half2 - __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1206): error: no operator "*" matches these operands
operand types are: __half2 * __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1210): error: identifier "h2rsqrt" is undefined
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1232): error: no operator "-" matches these operands
operand types are: - __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1232): error: identifier "h2rsqrt" is undefined
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1621): error: no operator "*=" matches these operands
operand types are: __half2 *= __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1624): error: no operator "-" matches these operands
operand types are: const __half2 - const __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1626): error: ambiguous "?" operation: second operand of type "<error-type>" can be converted to third operand type "const __half2", and vice versa
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1631): error: no operator "*=" matches these operands
operand types are: __half2 *= __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1633): error: no operator "-" matches these operands
operand types are: const __half2 - const __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1634): error: ambiguous "?" operation: second operand of type "<error-type>" can be converted to third operand type "const __half2", and vice versa
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1642): error: no operator "*" matches these operands
operand types are: __half2 * __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1642): error: identifier "h2sqrt" is undefined
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1646): error: identifier "h2rsqrt" is undefined
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1668): error: no operator "-" matches these operands
operand types are: - __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1703): error: no operator "+" matches these operands
operand types are: __half2 + const __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1710): error: no operator "+" matches these operands
operand types are: __half2 + const __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1940): error: no operator "*=" matches these operands
operand types are: __half2 *= __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1946): error: no operator "*=" matches these operands
operand types are: __half2 *= __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1959): error: no operator "-" matches these operands
operand types are: __half2 - __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1960): error: no operator "*" matches these operands
operand types are: __half2 * __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1964): error: identifier "h2rsqrt" is undefined
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1986): error: no operator "-" matches these operands
operand types are: - __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(1986): error: identifier "h2rsqrt" is undefined
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(2021): error: no operator "+" matches these operands
operand types are: __half2 + const __half2
/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu(2027): error: no operator "+" matches these operands
operand types are: __half2 + const __half2
38 errors detected in the compilation of "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/normalize_kernels.cu".
[4/8] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/general_kernels.cu -o general_kernels.cuda.o
[5/8] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/transform_kernels.cu -o transform_kernels.cuda.o
[6/8] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/gelu_kernels.cu -o gelu_kernels.cuda.o
[7/8] /usr/local/cuda/bin/nvcc -DTORCH_EXTENSION_NAME=stochastic_transformer -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1013\" -I/opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_61,code=sm_61 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -D__STOCHASTIC_MODE__ -c /opt/conda/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/softmax_kernels.cu -o softmax_kernels.cuda.o
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1549, in _run_ninja_build
    subprocess.run(
  File "/opt/conda/lib/python3.8/subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "experimentation.py", line 17, in <module>
    layer = DeepSpeedTransformerLayer(config=transformer_config)
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/transformer/transformer.py", line 543, in __init__
    stochastic_transformer_cuda_module = StochasticTransformerBuilder().load()
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 180, in load
    return self.jit_load(verbose)
  File "/opt/conda/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 208, in jit_load
    op_module = load(
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 999, in load
    return _jit_compile(
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1204, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1308, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1565, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'stochastic_transformer'
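
When reading a log like the one above, it can help to list which GPU architectures nvcc was actually asked to compile for. The sketch below is only an illustration: the helper name gencode_targets and the shortened sample command are made up here, not part of DeepSpeed or PyTorch. It pulls the compute capabilities out of the -gencode flags and flags any below 5.3. As far as I know, the CUDA __half2 arithmetic operators and intrinsics such as h2sqrt and h2rsqrt are only defined for compute capability 5.3 and newer, which would match the errors shown, but treat that as an assumption to check against your CUDA headers.

```python
import re
from typing import List

# Matches flags such as -gencode=arch=compute_52,code=sm_52
_GENCODE = re.compile(r"-gencode=arch=compute_(\d+),code=(?:sm|compute)_(\d+)")


def gencode_targets(nvcc_command: str) -> List[int]:
    """Return the sorted, de-duplicated compute capabilities named in -gencode flags."""
    return sorted({int(arch) for arch, _code in _GENCODE.findall(nvcc_command)})


# Shortened stand-in for one nvcc line from a build log (hypothetical sample).
sample = (
    "nvcc -gencode=arch=compute_60,code=sm_60 "
    "-gencode=arch=compute_86,code=compute_86 "
    "-gencode=arch=compute_52,code=sm_52 "
    "-gencode=arch=compute_80,code=sm_80 "
    "-gencode=arch=compute_80,code=compute_80 -c dropout_kernels.cu"
)

targets = gencode_targets(sample)
# Architectures older than 5.3 are the likely suspects for missing __half2 operators.
below_53 = [t for t in targets if t < 53]
print("targets:", targets)
print("below compute capability 5.3:", below_53)
```

If the second list is non-empty, the architecture list handed to the build is a good place to look before suspecting the DeepSpeed sources themselves.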
Issue Analytics
- Created 3 years ago
- Comments:28 (13 by maintainers)
Top GitHub Comments
root@x8a100-0000:/workspace# env | grep -i arch
TORCH_CUDA_ARCH_LIST=5.2 6.0 6.1 7.0 7.5 8.0 8.6+PTX

export TORCH_CUDA_ARCH_LIST=7.0
DS_BUILD_OPS=1 pip3 install deepspeed
Worked, thank you.
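
Instead of pinning a single architecture, one alternative is to keep the container's list and drop only the entries that are too old for the half-precision kernels. This is a minimal sketch, not something suggested in the thread: the function names are hypothetical, the 5.3 cutoff is my assumption about where the __half2 operators become available, and the parser only handles numeric entries such as 7.5 or 8.6+PTX (not named architectures).

```python
from typing import Tuple


def parse_arch(entry: str) -> Tuple[int, int]:
    """Turn an entry like '8.6+PTX' into a (major, minor) tuple."""
    major, minor = entry.replace("+PTX", "").split(".")
    return int(major), int(minor)


def drop_old_archs(arch_list: str, minimum: Tuple[int, int] = (5, 3)) -> str:
    """Keep only the entries at or above `minimum`; accepts space or ';' separators."""
    entries = arch_list.replace(";", " ").split()
    kept = [e for e in entries if parse_arch(e) >= minimum]
    return " ".join(kept)


# Value reported by `env | grep -i arch` in the container.
original = "5.2 6.0 6.1 7.0 7.5 8.0 8.6+PTX"
filtered = drop_old_archs(original)
print(filtered)
```

The printed value could then be exported as TORCH_CUDA_ARCH_LIST before running the DS_BUILD_OPS=1 install, so the build still covers every newer GPU in the list.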
unset TORCH_CUDA_ARCH_LIST
fixed the problem for me.
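
For the JIT build path used in the original script, the same fix can be applied from Python rather than the shell. The sketch below is an assumption-laden illustration: without_env is a hypothetical helper, and the first assignment only simulates the container's setting so the example is self-contained. The idea is that with TORCH_CUDA_ARCH_LIST unset, PyTorch's extension builder falls back to the architectures of the GPUs it can see, so the sm_52 target is no longer requested on a machine without such a card.

```python
import os
from contextlib import contextmanager


@contextmanager
def without_env(name: str):
    """Temporarily remove an environment variable, restoring it afterwards."""
    saved = os.environ.pop(name, None)
    try:
        yield
    finally:
        if saved is not None:
            os.environ[name] = saved


# Simulate the value the container sets (taken from the env output in the thread).
os.environ["TORCH_CUDA_ARCH_LIST"] = "5.2 6.0 6.1 7.0 7.5 8.0 8.6+PTX"

with without_env("TORCH_CUDA_ARCH_LIST"):
    inside = os.environ.get("TORCH_CUDA_ARCH_LIST")
    # The extension build would go here, e.g. constructing
    # DeepSpeedTransformerLayer(config=transformer_config).
    print("inside the block:", inside)

after = os.environ.get("TORCH_CUDA_ARCH_LIST")
print("after the block:", after)
```

A failed earlier build may leave partial files under the PyTorch extensions root shown in the log, so clearing that directory before retrying is worth considering.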