python backend with custom packages reports error "Internal: Failed to initialize stub, stub process exited unexpectedly"
Description
The server reports an error when a custom stub and a conda-pack environment are given to it: Internal: Failed to initialize stub, stub process exited unexpectedly
Triton Information
Triton version 21.06.1, using the container.
To Reproduce
I followed this to create the model.
Commands to reproduce
# install packages to use in python_backend
$ conda create -n gpt2 python=3.8
$ conda activate gpt2
$ conda install numpy
$ pip install transformers tokenizers torch conda-pack
# build stub
$ git clone https://github.com/triton-inference-server/python_backend -b r21.06
$ cd python_backend
$ mkdir build && cd build
$ cmake -DTRITON_ENABLE_GPU=ON -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install ..
$ make triton-python-backend-stub
# confirm that libpython is linked into the stub, then copy the stub to the model repository
$ ldd triton_python_backend_stub | grep python
libpython3.8.so.1.0 => /home/ubuntu/miniconda3/envs/gpt2/lib/libpython3.8.so.1.0 (0x00007f97bdbb0000)
$ cp triton_python_backend_stub /path/to/model_repository/gpt2
# copy conda pack to model repository
$ conda-pack
$ cp gpt2.tar.gz /path/to/model_repository/gpt2
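The ldd check above can also be scripted. The following is a minimal sketch, not part of the original report, that pulls the linked libpython version out of ldd output so it can be compared with the Python version of the conda environment. The function name linked_libpython is hypothetical, and the sample line is the one shown above.

```python
import re

# Matches lines such as:
#   libpython3.8.so.1.0 => /path/to/libpython3.8.so.1.0 (0x...)
_LIBPYTHON = re.compile(r"libpython(\d+\.\d+)\S*\s+=>\s+(\S+)")


def linked_libpython(ldd_output):
    """Return (version, path) of the first libpython in ldd output, or None."""
    for line in ldd_output.splitlines():
        match = _LIBPYTHON.search(line)
        if match:
            return match.group(1), match.group(2)
    return None


# Sample line copied from the ldd output in this report.
sample = (
    "libpython3.8.so.1.0 => "
    "/home/ubuntu/miniconda3/envs/gpt2/lib/libpython3.8.so.1.0 "
    "(0x00007f97bdbb0000)"
)
print(linked_libpython(sample))
```

A None result would suggest that the stub was not linked against a shared libpython at all, which is worth ruling out before looking elsewhere.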
Directory structure of model repository
model_repository
└── gpt2
    ├── 1
    │   └── model.py
    ├── config.pbtxt
    ├── triton_python_backend_stub
    └── gpt2.tar.gz
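As a quick sanity check of the layout above, the following minimal sketch lists which of the four expected files are absent from a model directory. It is an illustration only: the function name missing_model_files and its default arguments are hypothetical, chosen to match the names used in this report.

```python
from pathlib import Path


def missing_model_files(repo_root, model_name="gpt2", version="1",
                        env_archive="gpt2.tar.gz"):
    """Return the expected files that are absent, relative to repo_root."""
    model_dir = Path(repo_root) / model_name
    expected = [
        model_dir / version / "model.py",
        model_dir / "config.pbtxt",
        model_dir / "triton_python_backend_stub",
        model_dir / env_archive,
    ]
    return [
        p.relative_to(repo_root).as_posix()
        for p in expected
        if not p.is_file()
    ]


# Placeholder path from this report; replace it with the real repository root.
print(missing_model_files("/path/to/model_repository"))
```

An empty list only says that the files exist. It does not check that the stub or the packed environment is usable inside the container.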
model_repository/gpt2/1/model.py
import numpy as np
import triton_python_backend_utils as pb_utils
from transformers import GPT2LMHeadModel, GPT2Tokenizer


class TritonPythonModel:
    def initialize(self, args):
        # Load the pretrained GPT-2 model and tokenizer once per model instance.
        self.model = GPT2LMHeadModel.from_pretrained(
            'gpt2',
            max_length=128,
            repetition_penalty=2.0
        )
        self.tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

    def execute(self, requests):
        responses = []
        for request in requests:
            # Decode the first element of the string input into a Python str.
            in_0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            in_0 = in_0.as_numpy()[0].decode("utf-8")
            input_ids = self.tokenizer(in_0, return_tensors='pt').input_ids
            outputs = self.model.generate(
                input_ids,
                pad_token_id=self.tokenizer.pad_token_id,
                eos_token_id=self.tokenizer.eos_token_id,
                bos_token_id=self.tokenizer.bos_token_id,
                use_cache=True)
            out_0 = self.tokenizer.decode(outputs[0], skip_special_tokens=True)
            # Wrap the generated text in the output tensor. The original code
            # assigned this to out_0 but then referenced out_tensor_0.
            out_tensor_0 = pb_utils.Tensor("OUTPUT0",
                                           np.array([out_0], dtype=object))
            inference_response = pb_utils.InferenceResponse(
                output_tensors=[out_tensor_0])
            responses.append(inference_response)
        return responses

    def finalize(self):
        print('Cleaning up...')
model_repository/gpt2/config.pbtxt
backend: "python"
max_batch_size: 64
input [
  {
    name: "INPUT0"
    data_type: TYPE_STRING
    dims: [ -1 ]
  }
]
output [
  {
    name: "OUTPUT0"
    data_type: TYPE_STRING
    dims: [ -1 ]
  }
]
dynamic_batching {
  preferred_batch_size: [ 1,2,4,8,16,32,64 ]
  max_queue_delay_microseconds: 30000
}
instance_group [{ count: 1, kind: KIND_GPU }]
parameters: {
  key: "EXECUTION_ENV_PATH",
  value: {string_value: "/models/gpt2/gpt2.tar.gz"}
}
Then run the Docker container
$ docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v /path/to/model_repository:/models nvcr.io/nvidia/tritonserver:21.06.1-py3 tritonserver --model-repository=/models --log-verbose 10
=============================
== Triton Inference Server ==
=============================
NVIDIA Release 21.06 (build 24449615)
Copyright (c) 2018-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license
NOTE: Legacy NVIDIA Driver detected. Compatibility mode ENABLED.
I0708 02:45:43.242384 1 metrics.cc:291] Collecting metrics for GPU 0: Tesla V100-PCIE-16GB
I0708 02:45:43.242771 1 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/pytorch/libtriton_pytorch.so
I0708 02:45:43.578167 1 libtorch.cc:987] TRITONBACKEND_Initialize: pytorch
I0708 02:45:43.578216 1 libtorch.cc:997] Triton TRITONBACKEND API version: 1.4
I0708 02:45:43.578222 1 libtorch.cc:1003] 'pytorch' TRITONBACKEND API version: 1.4
I0708 02:45:43.578278 1 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so
2021-07-08 02:45:43.771931: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
I0708 02:45:43.815189 1 tensorflow.cc:2165] TRITONBACKEND_Initialize: tensorflow
I0708 02:45:43.815223 1 tensorflow.cc:2175] Triton TRITONBACKEND API version: 1.4
I0708 02:45:43.815229 1 tensorflow.cc:2181] 'tensorflow' TRITONBACKEND API version: 1.4
I0708 02:45:43.815234 1 tensorflow.cc:2205] backend configuration:
{}
I0708 02:45:43.815296 1 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so
I0708 02:45:43.816660 1 onnxruntime.cc:1969] TRITONBACKEND_Initialize: onnxruntime
I0708 02:45:43.816687 1 onnxruntime.cc:1979] Triton TRITONBACKEND API version: 1.4
I0708 02:45:43.816692 1 onnxruntime.cc:1985] 'onnxruntime' TRITONBACKEND API version: 1.4
I0708 02:45:43.825961 1 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/openvino/libtriton_openvino.so
I0708 02:45:43.834638 1 openvino.cc:1188] TRITONBACKEND_Initialize: openvino
I0708 02:45:43.834660 1 openvino.cc:1198] Triton TRITONBACKEND API version: 1.4
I0708 02:45:43.834666 1 openvino.cc:1204] 'openvino' TRITONBACKEND API version: 1.4
I0708 02:45:44.262809 1 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f8694000000' with size 268435456
I0708 02:45:44.263337 1 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0708 02:45:44.264080 1 backend_factory.h:45] Create TritonBackendFactory
I0708 02:45:44.264106 1 plan_backend_factory.cc:49] Create PlanBackendFactory
I0708 02:45:44.264111 1 plan_backend_factory.cc:56] Registering TensorRT Plugins
I0708 02:45:44.264149 1 logging.cc:52] Registered plugin creator - ::BatchTilePlugin_TRT version 1
I0708 02:45:44.264168 1 logging.cc:52] Registered plugin creator - ::BatchedNMS_TRT version 1
I0708 02:45:44.264186 1 logging.cc:52] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
I0708 02:45:44.264194 1 logging.cc:52] Registered plugin creator - ::CoordConvAC version 1
I0708 02:45:44.264202 1 logging.cc:52] Registered plugin creator - ::CropAndResize version 1
I0708 02:45:44.264212 1 logging.cc:52] Registered plugin creator - ::CropAndResizeDynamic version 1
I0708 02:45:44.264220 1 logging.cc:52] Registered plugin creator - ::DetectionLayer_TRT version 1
I0708 02:45:44.264237 1 logging.cc:52] Registered plugin creator - ::FlattenConcat_TRT version 1
I0708 02:45:44.264244 1 logging.cc:52] Registered plugin creator - ::GenerateDetection_TRT version 1
I0708 02:45:44.264254 1 logging.cc:52] Registered plugin creator - ::GridAnchor_TRT version 1
I0708 02:45:44.264273 1 logging.cc:52] Registered plugin creator - ::GridAnchorRect_TRT version 1
I0708 02:45:44.264282 1 logging.cc:52] Registered plugin creator - ::InstanceNormalization_TRT version 1
I0708 02:45:44.264290 1 logging.cc:52] Registered plugin creator - ::LReLU_TRT version 1
I0708 02:45:44.264303 1 logging.cc:52] Registered plugin creator - ::MultilevelCropAndResize_TRT version 1
I0708 02:45:44.264319 1 logging.cc:52] Registered plugin creator - ::MultilevelProposeROI_TRT version 1
I0708 02:45:44.264333 1 logging.cc:52] Registered plugin creator - ::NMS_TRT version 1
I0708 02:45:44.264341 1 logging.cc:52] Registered plugin creator - ::NMSDynamic_TRT version 1
I0708 02:45:44.264348 1 logging.cc:52] Registered plugin creator - ::Normalize_TRT version 1
I0708 02:45:44.264357 1 logging.cc:52] Registered plugin creator - ::PriorBox_TRT version 1
I0708 02:45:44.264369 1 logging.cc:52] Registered plugin creator - ::ProposalLayer_TRT version 1
I0708 02:45:44.264378 1 logging.cc:52] Registered plugin creator - ::Proposal version 1
I0708 02:45:44.264388 1 logging.cc:52] Registered plugin creator - ::ProposalDynamic version 1
I0708 02:45:44.264396 1 logging.cc:52] Registered plugin creator - ::PyramidROIAlign_TRT version 1
I0708 02:45:44.264407 1 logging.cc:52] Registered plugin creator - ::Region_TRT version 1
I0708 02:45:44.264415 1 logging.cc:52] Registered plugin creator - ::Reorg_TRT version 1
I0708 02:45:44.264422 1 logging.cc:52] Registered plugin creator - ::ResizeNearest_TRT version 1
I0708 02:45:44.264434 1 logging.cc:52] Registered plugin creator - ::RPROI_TRT version 1
I0708 02:45:44.264441 1 logging.cc:52] Registered plugin creator - ::SpecialSlice_TRT version 1
I0708 02:45:44.264447 1 logging.cc:52] Registered plugin creator - ::Split version 1
I0708 02:45:44.264460 1 ensemble_backend_factory.cc:47] Create EnsembleBackendFactory
I0708 02:45:44.265837 1 model_repository_manager.cc:749] AsyncLoad() 'gpt2'
I0708 02:45:44.265903 1 model_repository_manager.cc:988] TriggerNextAction() 'gpt2' version 1: 1
I0708 02:45:44.265917 1 model_repository_manager.cc:1026] Load() 'gpt2' version 1
I0708 02:45:44.265921 1 model_repository_manager.cc:1045] loading: gpt2:1
I0708 02:45:44.366670 1 model_repository_manager.cc:1105] CreateInferenceBackend() 'gpt2' version 1
I0708 02:45:44.366793 1 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/python/libtriton_python.so
I0708 02:45:44.369495 1 python.cc:1298] 'python' TRITONBACKEND API version: 1.4
I0708 02:45:44.369516 1 python.cc:1320] backend configuration:
{}
I0708 02:45:44.369527 1 python.cc:1397] shm-default-byte-size=67108864,shm-growth-byte-size=67108864,stub-timeout-seconds=30
I0708 02:45:44.369966 1 python.cc:1445] TRITONBACKEND_ModelInitialize: gpt2 (version 1)
I0708 02:45:44.370963 1 model_config_utils.cc:1521] ModelConfig 64-bit fields:
I0708 02:45:44.370981 1 model_config_utils.cc:1523] ModelConfig::dynamic_batching::default_queue_policy::default_timeout_microseconds
I0708 02:45:44.370985 1 model_config_utils.cc:1523] ModelConfig::dynamic_batching::max_queue_delay_microseconds
I0708 02:45:44.370989 1 model_config_utils.cc:1523] ModelConfig::dynamic_batching::priority_queue_policy::value::default_timeout_microseconds
I0708 02:45:44.370993 1 model_config_utils.cc:1523] ModelConfig::ensemble_scheduling::step::model_version
I0708 02:45:44.370997 1 model_config_utils.cc:1523] ModelConfig::input::dims
I0708 02:45:44.371001 1 model_config_utils.cc:1523] ModelConfig::input::reshape::shape
I0708 02:45:44.371005 1 model_config_utils.cc:1523] ModelConfig::instance_group::secondary_devices::device_id
I0708 02:45:44.371009 1 model_config_utils.cc:1523] ModelConfig::model_warmup::inputs::value::dims
I0708 02:45:44.371013 1 model_config_utils.cc:1523] ModelConfig::optimization::cuda::graph_spec::graph_lower_bound::input::value::dim
I0708 02:45:44.371017 1 model_config_utils.cc:1523] ModelConfig::optimization::cuda::graph_spec::input::value::dim
I0708 02:45:44.371021 1 model_config_utils.cc:1523] ModelConfig::output::dims
I0708 02:45:44.371026 1 model_config_utils.cc:1523] ModelConfig::output::reshape::shape
I0708 02:45:44.371030 1 model_config_utils.cc:1523] ModelConfig::sequence_batching::direct::max_queue_delay_microseconds
I0708 02:45:44.371034 1 model_config_utils.cc:1523] ModelConfig::sequence_batching::max_sequence_idle_microseconds
I0708 02:45:44.371038 1 model_config_utils.cc:1523] ModelConfig::sequence_batching::oldest::max_queue_delay_microseconds
I0708 02:45:44.371042 1 model_config_utils.cc:1523] ModelConfig::version_policy::specific::versions
I0708 02:45:44.371145 1 python.cc:1267] Using Python execution env /models/gpt2/gpt2.tar.gz
I0708 02:45:44.372722 1 python.cc:1489] TRITONBACKEND_ModelInstanceInitialize: gpt2_0 (GPU device 0)
I0708 02:45:44.374147 1 backend_model_instance.cc:105] Creating instance gpt2_0 on GPU 0 (7.0) using artifact ''
I0708 02:46:02.783664 56 python.cc:918] Starting Python backend stub: export LD_LIBRARY_PATH=/tmp/python_env_nED7ai/0/lib:$LD_LIBRARY_PATH; source /tmp/python_env_nED7ai/0/bin/activate && exec /models/gpt2/triton_python_backend_stub /models/gpt2/1/model.py /gpt2_0_GPU_0 67108864 67108864 1 /opt/tritonserver/backends/python
I0708 02:46:03.845931 1 python.cc:1549] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0708 02:46:03.851969 1 python.cc:1468] TRITONBACKEND_ModelFinalize: delete model state
I0708 02:46:03.852005 1 triton_backend_manager.cc:101] unloading backend 'python'
I0708 02:46:03.852011 1 python.cc:1425] TRITONBACKEND_Finalize: Start
I0708 02:46:04.577935 1 python.cc:1430] TRITONBACKEND_Finalize: End
E0708 02:46:04.579122 1 model_repository_manager.cc:1215] failed to load 'gpt2' version 1: Internal: Failed to initialize stub, stub process exited unexpectedly: gpt2_0
I0708 02:46:04.579144 1 model_repository_manager.cc:988] TriggerNextAction() 'gpt2' version 1: 0
I0708 02:46:04.579153 1 model_repository_manager.cc:1003] no next action, trigger OnComplete()
I0708 02:46:04.579229 1 model_repository_manager.cc:594] VersionStates() 'gpt2'
I0708 02:46:04.579281 1 model_repository_manager.cc:594] VersionStates() 'gpt2'
I0708 02:46:04.579335 1 server.cc:504]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+
I0708 02:46:04.579402 1 server.cc:543]
+-------------+-----------------------------------------------------------------+--------+
| Backend | Path | Config |
+-------------+-----------------------------------------------------------------+--------+
| tensorrt | <built-in> | {} |
| pytorch | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so | {} |
| tensorflow | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so | {} |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {} |
| openvino | /opt/tritonserver/backends/openvino/libtriton_openvino.so | {} |
+-------------+-----------------------------------------------------------------+--------+
I0708 02:46:04.579414 1 model_repository_manager.cc:570] BackendStates()
I0708 02:46:04.579444 1 server.cc:586]
+-------+---------+--------------------------------------------------------------------------------------------+
| Model | Version | Status |
+-------+---------+--------------------------------------------------------------------------------------------+
| gpt2 | 1 | UNAVAILABLE: Internal: Failed to initialize stub, stub process exited unexpectedly: gpt2_0 |
+-------+---------+--------------------------------------------------------------------------------------------+
I0708 02:46:04.579550 1 tritonserver.cc:1718]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.11.0 |
| server_extensions | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0] | /models |
| model_control_mode | MODE_NONE |
| strict_model_config | 1 |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0708 02:46:04.579569 1 server.cc:234] Waiting for in-flight requests to complete.
I0708 02:46:04.579574 1 model_repository_manager.cc:694] AsyncUnload() 'gpt2'
I0708 02:46:04.579579 1 model_repository_manager.cc:988] TriggerNextAction() 'gpt2' version 1: 2
I0708 02:46:04.579584 1 model_repository_manager.cc:1071] Unload() 'gpt2' version 1
I0708 02:46:04.579591 1 model_repository_manager.cc:534] LiveBackendStates()
I0708 02:46:04.579595 1 server.cc:249] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
I0708 02:46:04.579602 1 triton_backend_manager.cc:101] unloading backend 'pytorch'
I0708 02:46:04.579613 1 triton_backend_manager.cc:101] unloading backend 'tensorflow'
I0708 02:46:04.579631 1 triton_backend_manager.cc:101] unloading backend 'onnxruntime'
I0708 02:46:04.579660 1 triton_backend_manager.cc:101] unloading backend 'openvino'
error: creating server: Internal - failed to load all models
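With --log-verbose enabled, the one error line is easy to miss among the informational output. The following minimal sketch, which is not part of the original report, filters a log like the one above by its leading severity letter. It assumes the glog-style prefix seen in this log (severity letter, four-digit date, timestamp), and the function name lines_with_severity is hypothetical. Lines without that prefix, such as the final "error: creating server" line, are not matched.

```python
import re

# Prefix as seen in the log above, for example:
#   E0708 02:46:04.579122 1 model_repository_manager.cc:1215] failed to load ...
_SEVERITY = re.compile(r"^([IWEF])\d{4} \d{2}:\d{2}:\d{2}\.\d+ ")


def lines_with_severity(log_text, severities="EF"):
    """Return the log lines whose severity letter is in `severities`."""
    selected = []
    for line in log_text.splitlines():
        match = _SEVERITY.match(line)
        if match and match.group(1) in severities:
            selected.append(line)
    return selected


# Two lines copied from the log in this report.
sample_log = "\n".join([
    "I0708 02:46:04.577935 1 python.cc:1430] TRITONBACKEND_Finalize: End",
    "E0708 02:46:04.579122 1 model_repository_manager.cc:1215] failed to load "
    "'gpt2' version 1: Internal: Failed to initialize stub, stub process "
    "exited unexpectedly: gpt2_0",
])
for line in lines_with_severity(sample_log):
    print(line)
```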
Issue Analytics
- State:
- Created 2 years ago
- Comments:14 (7 by maintainers)
Top GitHub Comments
Thanks for the detailed info. I have filed a bug against myself to investigate why this doesn’t happen in the cases that you have shared.
@Tabrizian I have found that the ‘shared memory’ error is triggered only under one condition: a GPU is passed to the container and the model’s kind is KIND_GPU.