
[BUG] Unable to build extension "transformer_inference"


Describe the bug

After installing deepspeed, I try to run very basic inference with the transformers library but it seems that there is no way to install the transformer_inference extension necessary for inference. This is even after adding the flag for DS_BUILD_TRANSFORMER_INFERENCE=1 to the install process. I can get around this problem by just running init_inference twice and catching the first error, but this seems wrong.
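
For reference, a minimal sketch of that retry workaround as the reporter describes it (hypothetical; mp_size=1 is assumed for a single process, and it relies on the second call finding the extension that the first call left behind in the cache):

import deepspeed
import transformers

model = transformers.AutoModelForCausalLM.from_pretrained('gpt2')
try:
    model = deepspeed.init_inference(model,
                                     mp_size=1,
                                     replace_with_kernel_inject=True,
                                     replace_method='auto')
except RuntimeError:
    # First call fails while JIT-building transformer_inference;
    # per the report, a second call then loads the op successfully.
    model = deepspeed.init_inference(model,
                                     mp_size=1,
                                     replace_with_kernel_inject=True,
                                     replace_method='auto')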

To Reproduce

Steps to reproduce the behavior: install DeepSpeed with the transformer_inference op prebuilt, then run a minimal inference script:
DS_BUILD_OPS=1 DS_BUILD_TRANSFORMER_INFERENCE=1 pip install deepspeed
python -c """
import deepspeed
import transformers
import os
model = transformers.AutoModelForCausalLM.from_pretrained('gpt2')
world_size = int(os.getenv('WORLD_SIZE', '1'))
model = deepspeed.init_inference(
            model,
            mp_size=world_size,
            replace_with_kernel_inject=True,
            replace_method='auto',
        )
"""

Error:

File "~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 468, in replace_with_policy
    new_module = transformer_inference.DeepSpeedTransformerInference(
  File "~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/model_implementations/transformers/ds_transformer.py", line 53, in __init__
    inference_cuda_module = builder.load()
  File "~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 460, in load
    return self.jit_load(verbose)
  File "~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 495, in jit_load
    op_module = load(
  File "~/.conda/envs/py3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 986, in load
    return _jit_compile(
  File "~/.conda/envs/py3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1193, in _jit_compile
    _write_ninja_file_and_build_library(
  File "~/.conda/envs/py3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1297, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "~/.conda/envs/py3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1555, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'transformer_inference'

Expected behavior

Expected transformer and transformer_inference to show as installed in the output of the ds_report command.

ds_report output

--------------------------------------------------
DeepSpeed C++/CUDA extension op report
--------------------------------------------------
NOTE: Ops not installed will be just-in-time (JIT) compiled at
      runtime if needed. Op compatibility means that your system
      meet the required dependencies to JIT install the op.
--------------------------------------------------
JIT compiled ops requires ninja
ninja .................. [OKAY]
--------------------------------------------------
op name ................ installed .. compatible
--------------------------------------------------
cpu_adam ............... [NO] ....... [OKAY]
cpu_adagrad ............ [NO] ....... [OKAY]
fused_adam ............. [NO] ....... [OKAY]
fused_lamb ............. [NO] ....... [OKAY]
sparse_attn ............ [NO] ....... [OKAY]
transformer ............ [NO] ....... [OKAY]
stochastic_transformer . [NO] ....... [OKAY]
 [WARNING]  async_io requires the dev libaio .so object and headers but these were not found.
 [WARNING]  async_io: please install the libaio-devel package with yum
 [WARNING]  If libaio is already installed (perhaps from source), try setting the CFLAGS and LDFLAGS environment variables to where it can be found.
async_io ............... [NO] ....... [NO]
utils .................. [NO] ....... [OKAY]
quantizer .............. [NO] ....... [OKAY]
transformer_inference .. [NO] ....... [OKAY]
spatial_inference ...... [NO] ....... [OKAY]
--------------------------------------------------
DeepSpeed general environment info:
torch install path ............... ['~/.conda/envs/py3/lib/python3.8/site-packages/torch']
torch version .................... 1.7.1+cu110
torch cuda version ............... 11.0
torch hip version ................ None
nvcc version ..................... 11.3
deepspeed install path ........... ['~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed']
deepspeed info ................... 0.7.7, unknown, unknown
deepspeed wheel compiled w. ...... torch 1.7, cuda 11.0


System info:

  • OS: Ubuntu
  • GPU count and types: A100 machine with 8 GPUs
  • DeepSpeed version: 0.7.7
  • Hugging Face Transformers version: 4.24.0
  • Python version: 3.8.15


Issue Analytics

  • State: open
  • Created: 9 months ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
RezaYazdaniAminabadi commented, Dec 15, 2022

Hi @ianbstewart

I just checked on my side, and I see that the compiler chosen by torch is different between us. I am seeing nvcc for the CUDA files such as gelu.cu; however, I see c++ for the cpp files such as pt_binding.cpp:

[6/9] /usr/local/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=transformer_inference -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/reyazda/DeepSpeed/deepspeed/ops/csrc/transformer/inference/includes -I/home/reyazda/DeepSpeed/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_70,code=compute_70 -gencode=arch=compute_70,code=sm_70 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_70,code=compute_70 -c /home/reyazda/DeepSpeed/deepspeed/ops/csrc/transformer/inference/csrc/gelu.cu -o gelu.cuda.o 

[8/9] c++ -MMD -MF pt_binding.o.d -DTORCH_EXTENSION_NAME=transformer_inference -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/home/reyazda/DeepSpeed/deepspeed/ops/csrc/transformer/inference/includes -I/home/reyazda/DeepSpeed/deepspeed/ops/csrc/includes -isystem /opt/conda/lib/python3.8/site-packages/torch/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem /opt/conda/lib/python3.8/site-packages/torch/include/TH -isystem /opt/conda/lib/python3.8/site-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /opt/conda/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -std=c++14 -g -Wno-reorder -c /home/reyazda/DeepSpeed/deepspeed/ops/csrc/transformer/inference/csrc/pt_binding.cpp -o pt_binding.o 

I am not sure how this ends up set to nvc++ on your side (/share/apps/nvhpc/22.3/Linux_x86_64/22.3/compilers/bin/nvc++)! There must be some way to point torch at the right compiler for the cpp files.

@jeffra, do you have any idea how to resolve this issue?

Thanks, Reza
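
PyTorch's JIT extension builder generally respects the CC/CXX environment variables when choosing a host compiler, so one possible fix is to force the GNU toolchain before anything triggers the build. This is a sketch, not verified against this exact torch version, and the env-var handling differs slightly across releases:

import os

# Steer torch.utils.cpp_extension toward GNU compilers instead of nvc++.
# Assumption: gcc/g++ are on PATH and ABI-compatible with the installed torch.
os.environ['CC'] = 'gcc'
os.environ['CXX'] = 'g++'

import deepspeed
import transformers

model = transformers.AutoModelForCausalLM.from_pretrained('gpt2')
model = deepspeed.init_inference(model,
                                 mp_size=1,
                                 replace_with_kernel_inject=True,
                                 replace_method='auto')

Equivalently, export CC/CXX in the shell before launching; it may also help to clear ~/.cache/torch_extensions first so the failed build directory is not reused.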

0 reactions
ianbstewart commented, Dec 14, 2022

See below:

~/.conda/envs/py3/lib/python3.8/site-packages/torch/utils/cpp_extension.py:266: UserWarning: 

                               !! WARNING !!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Your compiler (/share/apps/nvhpc/22.3/Linux_x86_64/22.3/compilers/bin/nvc++) is not compatible with the compiler Pytorch was
built with for this platform, which is g++ on linux. Please
use g++ to to compile your extension. Alternatively, you may
compile PyTorch from source using /share/apps/nvhpc/22.3/Linux_x86_64/22.3/compilers/bin/nvc++, and then you can also use
/share/apps/nvhpc/22.3/Linux_x86_64/22.3/compilers/bin/nvc++ to compile your extension.

See https://github.com/pytorch/pytorch/blob/master/CONTRIBUTING.md for help
with compiling PyTorch from source.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

                              !! WARNING !!

  warnings.warn(WRONG_COMPILER_WARNING.format(
Traceback (most recent call last):
  File "~/.conda/envs/py3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1533, in _run_ninja_build
    subprocess.run(
  File "~/.conda/envs/py3/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "debug_deepspeed_inference.py", line 6, in <module>
    model = deepspeed.init_inference(
  File "~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/__init__.py", line 311, in init_inference
    engine = InferenceEngine(model, config=ds_inference_config)
  File "~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/inference/engine.py", line 124, in __init__
    self._apply_injection_policy(config)
  File "~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/inference/engine.py", line 349, in _apply_injection_policy
    replace_transformer_layer(client_module,
  File "~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 899, in replace_transformer_layer
    replaced_module = replace_module(model=model,
  File "~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 1168, in replace_module
    replaced_module, _ = _replace_module(model, policy)
  File "~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 1195, in _replace_module
    _, layer_id = _replace_module(child, policies, layer_id=layer_id)
  File "~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 1195, in _replace_module
    _, layer_id = _replace_module(child, policies, layer_id=layer_id)
  File "~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 1185, in _replace_module
    replaced_module = policies[child.__class__][0](child,
  File "~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 889, in replace_fn
    new_module = replace_with_policy(child,
  File "~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/module_inject/replace_module.py", line 468, in replace_with_policy
    new_module = transformer_inference.DeepSpeedTransformerInference(
  File "~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/model_implementations/transformers/ds_transformer.py", line 53, in __init__
    inference_cuda_module = builder.load()
  File "~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 460, in load
    return self.jit_load(verbose)
  File "~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/ops/op_builder/builder.py", line 495, in jit_load
    op_module = load(
  File "~/.conda/envs/py3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 986, in load
    return _jit_compile(
  File "~/.conda/envs/py3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1193, in _jit_compile
    _write_ninja_file_and_build_library(
  File "~/.conda/envs/py3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1297, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "~/.conda/envs/py3/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1555, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'transformer_inference'

[2022-12-14 14:08:34,996] [WARNING] [runner.py:179:fetch_hostfile] Unable to find hostfile, will proceed with training with local resources only.
Detected CUDA_VISIBLE_DEVICES=0,2 but ignoring it because one or several of --include/--exclude/--num_gpus/--num_nodes cl args were used. If you want to use CUDA_VISIBLE_DEVICES don't pass any of these arguments to deepspeed.
[2022-12-14 14:08:35,277] [INFO] [runner.py:508:main] cmd = ~/.conda/envs/py3/bin/python -u -m deepspeed.launcher.launch --world_info=eyJsb2NhbGhvc3QiOiBbMCwgMV19 --master_addr=127.0.0.1 --master_port=29500 debug_deepspeed_inference.py
[2022-12-14 14:08:36,468] [INFO] [launch.py:142:main] WORLD INFO DICT: {'localhost': [0, 1]}
[2022-12-14 14:08:36,469] [INFO] [launch.py:148:main] nnodes=1, num_local_procs=2, node_rank=0
[2022-12-14 14:08:36,469] [INFO] [launch.py:161:main] global_rank_mapping=defaultdict(<class 'list'>, {'localhost': [0, 1]})
[2022-12-14 14:08:36,469] [INFO] [launch.py:162:main] dist_world_size=2
[2022-12-14 14:08:36,469] [INFO] [launch.py:164:main] Setting CUDA_VISIBLE_DEVICES=0,1
[2022-12-14 14:08:43,175] [INFO] [logging.py:68:log_dist] [Rank -1] DeepSpeed info: version=0.7.7, git-hash=unknown, git-branch=unknown
[2022-12-14 14:08:43,176] [WARNING] [config_utils.py:67:_process_deprecated_field] Config parameter mp_size is deprecated use tensor_parallel.tp_size instead
[2022-12-14 14:08:43,176] [INFO] [logging.py:68:log_dist] [Rank -1] quantize_bits = 8 mlp_extra_grouping = False, quantize_groups = 1
[2022-12-14 14:08:43,189] [INFO] [logging.py:68:log_dist] [Rank -1] DeepSpeed info: version=0.7.7, git-hash=unknown, git-branch=unknown
[2022-12-14 14:08:43,189] [WARNING] [config_utils.py:67:_process_deprecated_field] Config parameter mp_size is deprecated use tensor_parallel.tp_size instead
[2022-12-14 14:08:43,190] [INFO] [logging.py:68:log_dist] [Rank -1] quantize_bits = 8 mlp_extra_grouping = False, quantize_groups = 1
[2022-12-14 14:08:43,365] [INFO] [comm.py:654:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
Installed CUDA version 11.3 does not match the version torch was compiled with 11.0 but since the APIs are compatible, accepting this combination
Installed CUDA version 11.3 does not match the version torch was compiled with 11.0 but since the APIs are compatible, accepting this combination
Using ~/.cache/torch_extensions as PyTorch extensions root...
Using ~/.cache/torch_extensions as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file ~/.cache/torch_extensions/transformer_inference/build.ninja...
Building extension module transformer_inference...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/9] /share/apps/nvhpc/22.3/Linux_x86_64/22.3/compilers/bin/nvc++ -MMD -MF pt_binding.o.d -DTORCH_EXTENSION_NAME=transformer_inference -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/inference/includes -I~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem ~/.conda/envs/py3/lib/python3.8/site-packages/torch/include -isystem ~/.conda/envs/py3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem ~/.conda/envs/py3/lib/python3.8/site-packages/torch/include/TH -isystem ~/.conda/envs/py3/lib/python3.8/site-packages/torch/include/THC -isystem /share/apps/cuda/11.3/include -isystem ~/.conda/envs/py3/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -std=c++14 -g -Wno-reorder -c ~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/pt_binding.cpp -o pt_binding.o 
FAILED: pt_binding.o 
/share/apps/nvhpc/22.3/Linux_x86_64/22.3/compilers/bin/nvc++ -MMD -MF pt_binding.o.d -DTORCH_EXTENSION_NAME=transformer_inference -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/inference/includes -I~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem ~/.conda/envs/py3/lib/python3.8/site-packages/torch/include -isystem ~/.conda/envs/py3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem ~/.conda/envs/py3/lib/python3.8/site-packages/torch/include/TH -isystem ~/.conda/envs/py3/lib/python3.8/site-packages/torch/include/THC -isystem /share/apps/cuda/11.3/include -isystem ~/.conda/envs/py3/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -O3 -std=c++14 -g -Wno-reorder -c ~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/pt_binding.cpp -o pt_binding.o 
nvc++-Error-Unknown switch: -Wno-reorder
[2/9] /share/apps/cuda/11.3/bin/nvcc -ccbin /share/apps/nvhpc/22.3/Linux_x86_64/22.3/compilers/bin/nvc -DTORCH_EXTENSION_NAME=transformer_inference -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/inference/includes -I~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem ~/.conda/envs/py3/lib/python3.8/site-packages/torch/include -isystem ~/.conda/envs/py3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem ~/.conda/envs/py3/lib/python3.8/site-packages/torch/include/TH -isystem ~/.conda/envs/py3/lib/python3.8/site-packages/torch/include/THC -isystem /share/apps/cuda/11.3/include -isystem ~/.conda/envs/py3/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_80,code=sm_80 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -c ~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/gelu.cu -o gelu.cuda.o 
FAILED: gelu.cuda.o 
/share/apps/cuda/11.3/bin/nvcc -ccbin /share/apps/nvhpc/22.3/Linux_x86_64/22.3/compilers/bin/nvc -DTORCH_EXTENSION_NAME=transformer_inference -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/inference/includes -I~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/ops/csrc/includes -isystem ~/.conda/envs/py3/lib/python3.8/site-packages/torch/include -isystem ~/.conda/envs/py3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -isystem ~/.conda/envs/py3/lib/python3.8/site-packages/torch/include/TH -isystem ~/.conda/envs/py3/lib/python3.8/site-packages/torch/include/THC -isystem /share/apps/cuda/11.3/include -isystem ~/.conda/envs/py3/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_80,code=sm_80 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++14 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_80,code=sm_80 -gencode=arch=compute_80,code=compute_80 -c ~/.conda/envs/py3/lib/python3.8/site-packages/deepspeed/ops/csrc/transformer/inference/csrc/gelu.cu -o gelu.cuda.o 
nvcc fatal   : Unsupported PGI compiler found.  pgc++ is the only PGI compiler that is supported.
ninja: build stopped: subcommand failed.
Loading extension module transformer_inference...
Time to load transformer_inference op: 0.6002025604248047 seconds
Installed CUDA version 11.3 does not match the version torch was compiled with 11.0 but since the APIs are compatible, accepting this combination
Using ~/.cache/torch_extensions as PyTorch extensions root...
No modifications detected for re-loaded extension module transformer_inference, skipping build step...
Loading extension module transformer_inference...
Time to load transformer_inference op: 0.06150174140930176 seconds
Installed CUDA version 11.3 does not match the version torch was compiled with 11.0 but since the APIs are compatible, accepting this combination
Using ~/.cache/torch_extensions as PyTorch extensions root...
No modifications detected for re-loaded extension module transformer_inference, skipping build step...
Loading extension module transformer_inference...
Time to load transformer_inference op: 0.06500792503356934 seconds
[2022-12-14 14:08:52,499] [INFO] [launch.py:318:sigkill_handler] Killing subprocess 149933
[2022-12-14 14:08:52,499] [INFO] [launch.py:318:sigkill_handler] Killing subprocess 149934
[2022-12-14 14:08:52,742] [ERROR] [launch.py:324:sigkill_handler] ['~/.conda/envs/py3/bin/python', '-u', 'debug_deepspeed_inference.py', '--local_rank=1'] exits with return code = 1
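
A quick way to see which host compiler torch's JIT builder is going to invoke, before triggering a long build (a diagnostic sketch, assuming the usual Linux behavior of falling back to whatever c++ resolves to when CXX is unset):

import os
import shutil

# Compiler torch.utils.cpp_extension will use for the C++ sources of a JIT op
# (on Linux it falls back to 'c++' when the CXX environment variable is unset).
compiler = os.environ.get('CXX', 'c++')
print(f'{compiler} -> {shutil.which(compiler)}')

On the reporter's cluster this would presumably resolve to the NVHPC nvc++ shown in the warning above, which is exactly the mismatch Reza points out.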

