
python backend with custom packages reports error "Internal: Failed to initialize stub, stub process exited unexpectedly"

See original GitHub issue

Description

The server reports "Internal: Failed to initialize stub, stub process exited unexpectedly" when the custom stub and a conda-pack environment are given to it.

Triton Information

Triton version 21.06.1 (container).

To Reproduce

I followed this guide to create the model.

Commands to reproduce

# install packages to use in python_backend
$ conda create -n gpt2 python=3.8
$ conda activate gpt2
$ conda install numpy
$ pip install transformers tokenizers torch conda-pack

# build stub
$ git clone https://github.com/triton-inference-server/python_backend -b r21.06
$ cd python_backend
$ mkdir build && cd build
$ cmake -DTRITON_ENABLE_GPU=ON -DCMAKE_INSTALL_PREFIX:PATH=`pwd`/install ..
$ make triton-python-backend-stub

# confirm libpython is linked into the stub, then copy it to the model repository
$ ldd triton_python_backend_stub | grep python
        libpython3.8.so.1.0 => /home/ubuntu/miniconda3/envs/gpt2/lib/libpython3.8.so.1.0 (0x00007f97bdbb0000)
$ cp triton_python_backend_stub /path/to/model_repository/gpt2
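
A common cause of stub startup failures is a Python minor-version mismatch between the stub (linked against libpython3.8 above) and the packed environment. Nothing in this report confirms a mismatch, but it is cheap to rule out with a check inside the activated env:

# run inside the activated 'gpt2' env before packing it; the stub is linked
# against libpython3.8.so.1.0 (see the ldd output above), so the packed
# environment must report Python 3.8 as well
import sys

assert sys.version_info[:2] == (3, 8), f"unexpected interpreter: {sys.version}"
print(sys.version)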

# copy conda pack to model repository
$ conda-pack
$ cp gpt2.tar.gz /path/to/model_repository/gpt2
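
The config below points EXECUTION_ENV_PATH at this archive, so it may also be worth sanity-checking the pack itself first. A minimal sketch (the path matches this repro; adjust as needed):

# hypothetical sanity check on the conda-pack archive: it should contain
# bin/activate and a libpython matching the stub's linkage
import tarfile

with tarfile.open("/path/to/model_repository/gpt2/gpt2.tar.gz") as tar:
    names = tar.getnames()
    assert any(n.endswith("bin/activate") for n in names), "not a conda-pack archive?"
    print([n for n in names if "libpython" in n])  # expect libpython3.8.so.1.0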

Directory structure of model repository

model_repository
└── gpt2
    ├── 1
    │   └── model.py
    ├── config.pbtxt
    ├── triton_python_backend_stub
    └── gpt2.tar.gz

model_repository/gpt2/1/model.py

import numpy as np

import triton_python_backend_utils as pb_utils

from transformers import GPT2LMHeadModel, GPT2Tokenizer


class TritonPythonModel:

    def initialize(self, args):

        self.model = GPT2LMHeadModel.from_pretrained(
            'gpt2',
            max_length = 128,
            repetition_penalty = 2.0
        )
        self.tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

    def execute(self, requests):

        responses = []

        for request in requests:
            in_0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            in_0 = in_0.as_numpy()[0].decode("utf-8")
            input_ids = self.tokenizer(in_0, return_tensors='pt').input_ids

            outputs = self.model.generate(
                input_ids,
                pad_token_id=self.tokenizer.pad_token_id,
                eos_token_id=self.tokenizer.eos_token_id,
                bos_token_id=self.tokenizer.bos_token_id,
                use_cache=True)
            out_0 = self.tokenizer.decode(outputs[0], skip_special_tokens=True)

            out_tensor_0 = pb_utils.Tensor("OUTPUT0",
                                           np.array([out_0], dtype=object))

            inference_response = pb_utils.InferenceResponse(
                output_tensors=[out_tensor_0])
            responses.append(inference_response)
        return responses

    def finalize(self):

        print('Cleaning up...')
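
For reference, once the model loads, a request against this model could look like the following sketch. It assumes the standard tritonclient package and a server on localhost, neither of which appears in the original report:

# minimal client sketch (hypothetical; not part of the original issue)
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# max_batch_size is set in config.pbtxt, so the first dimension is the batch
text = np.array([["Hello, my name is"]], dtype=object)  # shape [1, 1]
inp = httpclient.InferInput("INPUT0", [1, 1], "BYTES")
inp.set_data_from_numpy(text)

result = client.infer("gpt2", inputs=[inp])
print(result.as_numpy("OUTPUT0"))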

model_repository/gpt2/config.pbtxt

backend: "python"

max_batch_size: 64


input [
  {
    name: "INPUT0"
    data_type: TYPE_STRING
    dims: [ -1 ]
  }
]

output [
  {
    name: "OUTPUT0"
    data_type: TYPE_STRING
    dims: [ -1 ]
  }
]

dynamic_batching {
  preferred_batch_size: [ 1,2,4,8,16,32,64 ]
  max_queue_delay_microseconds: 30000
}

instance_group [{ count: 1, kind: KIND_GPU }]

parameters: {
  key: "EXECUTION_ENV_PATH",
  value: {string_value: "/models/gpt2/gpt2.tar.gz"}
}

Then run the Docker container:

$ docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v /path/to/model_repository:/models nvcr.io/nvidia/tritonserver:21.06.1-py3 tritonserver --model-repository=/models --log-verbose 10


=============================
== Triton Inference Server ==
=============================

NVIDIA Release 21.06 (build 24449615)

Copyright (c) 2018-2021, NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION.  All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

NOTE: Legacy NVIDIA Driver detected.  Compatibility mode ENABLED.

I0708 02:45:43.242384 1 metrics.cc:291] Collecting metrics for GPU 0: Tesla V100-PCIE-16GB
I0708 02:45:43.242771 1 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/pytorch/libtriton_pytorch.so
I0708 02:45:43.578167 1 libtorch.cc:987] TRITONBACKEND_Initialize: pytorch
I0708 02:45:43.578216 1 libtorch.cc:997] Triton TRITONBACKEND API version: 1.4
I0708 02:45:43.578222 1 libtorch.cc:1003] 'pytorch' TRITONBACKEND API version: 1.4
I0708 02:45:43.578278 1 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so
2021-07-08 02:45:43.771931: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
I0708 02:45:43.815189 1 tensorflow.cc:2165] TRITONBACKEND_Initialize: tensorflow
I0708 02:45:43.815223 1 tensorflow.cc:2175] Triton TRITONBACKEND API version: 1.4
I0708 02:45:43.815229 1 tensorflow.cc:2181] 'tensorflow' TRITONBACKEND API version: 1.4
I0708 02:45:43.815234 1 tensorflow.cc:2205] backend configuration:
{}
I0708 02:45:43.815296 1 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so
I0708 02:45:43.816660 1 onnxruntime.cc:1969] TRITONBACKEND_Initialize: onnxruntime
I0708 02:45:43.816687 1 onnxruntime.cc:1979] Triton TRITONBACKEND API version: 1.4
I0708 02:45:43.816692 1 onnxruntime.cc:1985] 'onnxruntime' TRITONBACKEND API version: 1.4
I0708 02:45:43.825961 1 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/openvino/libtriton_openvino.so
I0708 02:45:43.834638 1 openvino.cc:1188] TRITONBACKEND_Initialize: openvino
I0708 02:45:43.834660 1 openvino.cc:1198] Triton TRITONBACKEND API version: 1.4
I0708 02:45:43.834666 1 openvino.cc:1204] 'openvino' TRITONBACKEND API version: 1.4
I0708 02:45:44.262809 1 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f8694000000' with size 268435456
I0708 02:45:44.263337 1 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0708 02:45:44.264080 1 backend_factory.h:45] Create TritonBackendFactory
I0708 02:45:44.264106 1 plan_backend_factory.cc:49] Create PlanBackendFactory
I0708 02:45:44.264111 1 plan_backend_factory.cc:56] Registering TensorRT Plugins
I0708 02:45:44.264149 1 logging.cc:52] Registered plugin creator - ::BatchTilePlugin_TRT version 1
I0708 02:45:44.264168 1 logging.cc:52] Registered plugin creator - ::BatchedNMS_TRT version 1
I0708 02:45:44.264186 1 logging.cc:52] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
I0708 02:45:44.264194 1 logging.cc:52] Registered plugin creator - ::CoordConvAC version 1
I0708 02:45:44.264202 1 logging.cc:52] Registered plugin creator - ::CropAndResize version 1
I0708 02:45:44.264212 1 logging.cc:52] Registered plugin creator - ::CropAndResizeDynamic version 1
I0708 02:45:44.264220 1 logging.cc:52] Registered plugin creator - ::DetectionLayer_TRT version 1
I0708 02:45:44.264237 1 logging.cc:52] Registered plugin creator - ::FlattenConcat_TRT version 1
I0708 02:45:44.264244 1 logging.cc:52] Registered plugin creator - ::GenerateDetection_TRT version 1
I0708 02:45:44.264254 1 logging.cc:52] Registered plugin creator - ::GridAnchor_TRT version 1
I0708 02:45:44.264273 1 logging.cc:52] Registered plugin creator - ::GridAnchorRect_TRT version 1
I0708 02:45:44.264282 1 logging.cc:52] Registered plugin creator - ::InstanceNormalization_TRT version 1
I0708 02:45:44.264290 1 logging.cc:52] Registered plugin creator - ::LReLU_TRT version 1
I0708 02:45:44.264303 1 logging.cc:52] Registered plugin creator - ::MultilevelCropAndResize_TRT version 1
I0708 02:45:44.264319 1 logging.cc:52] Registered plugin creator - ::MultilevelProposeROI_TRT version 1
I0708 02:45:44.264333 1 logging.cc:52] Registered plugin creator - ::NMS_TRT version 1
I0708 02:45:44.264341 1 logging.cc:52] Registered plugin creator - ::NMSDynamic_TRT version 1
I0708 02:45:44.264348 1 logging.cc:52] Registered plugin creator - ::Normalize_TRT version 1
I0708 02:45:44.264357 1 logging.cc:52] Registered plugin creator - ::PriorBox_TRT version 1
I0708 02:45:44.264369 1 logging.cc:52] Registered plugin creator - ::ProposalLayer_TRT version 1
I0708 02:45:44.264378 1 logging.cc:52] Registered plugin creator - ::Proposal version 1
I0708 02:45:44.264388 1 logging.cc:52] Registered plugin creator - ::ProposalDynamic version 1
I0708 02:45:44.264396 1 logging.cc:52] Registered plugin creator - ::PyramidROIAlign_TRT version 1
I0708 02:45:44.264407 1 logging.cc:52] Registered plugin creator - ::Region_TRT version 1
I0708 02:45:44.264415 1 logging.cc:52] Registered plugin creator - ::Reorg_TRT version 1
I0708 02:45:44.264422 1 logging.cc:52] Registered plugin creator - ::ResizeNearest_TRT version 1
I0708 02:45:44.264434 1 logging.cc:52] Registered plugin creator - ::RPROI_TRT version 1
I0708 02:45:44.264441 1 logging.cc:52] Registered plugin creator - ::SpecialSlice_TRT version 1
I0708 02:45:44.264447 1 logging.cc:52] Registered plugin creator - ::Split version 1
I0708 02:45:44.264460 1 ensemble_backend_factory.cc:47] Create EnsembleBackendFactory
I0708 02:45:44.265837 1 model_repository_manager.cc:749] AsyncLoad() 'gpt2'
I0708 02:45:44.265903 1 model_repository_manager.cc:988] TriggerNextAction() 'gpt2' version 1: 1
I0708 02:45:44.265917 1 model_repository_manager.cc:1026] Load() 'gpt2' version 1
I0708 02:45:44.265921 1 model_repository_manager.cc:1045] loading: gpt2:1
I0708 02:45:44.366670 1 model_repository_manager.cc:1105] CreateInferenceBackend() 'gpt2' version 1
I0708 02:45:44.366793 1 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/python/libtriton_python.so
I0708 02:45:44.369495 1 python.cc:1298] 'python' TRITONBACKEND API version: 1.4
I0708 02:45:44.369516 1 python.cc:1320] backend configuration:
{}
I0708 02:45:44.369527 1 python.cc:1397] shm-default-byte-size=67108864,shm-growth-byte-size=67108864,stub-timeout-seconds=30
I0708 02:45:44.369966 1 python.cc:1445] TRITONBACKEND_ModelInitialize: gpt2 (version 1)
I0708 02:45:44.370963 1 model_config_utils.cc:1521] ModelConfig 64-bit fields:
I0708 02:45:44.370981 1 model_config_utils.cc:1523]     ModelConfig::dynamic_batching::default_queue_policy::default_timeout_microseconds
I0708 02:45:44.370985 1 model_config_utils.cc:1523]     ModelConfig::dynamic_batching::max_queue_delay_microseconds
I0708 02:45:44.370989 1 model_config_utils.cc:1523]     ModelConfig::dynamic_batching::priority_queue_policy::value::default_timeout_microseconds
I0708 02:45:44.370993 1 model_config_utils.cc:1523]     ModelConfig::ensemble_scheduling::step::model_version
I0708 02:45:44.370997 1 model_config_utils.cc:1523]     ModelConfig::input::dims
I0708 02:45:44.371001 1 model_config_utils.cc:1523]     ModelConfig::input::reshape::shape
I0708 02:45:44.371005 1 model_config_utils.cc:1523]     ModelConfig::instance_group::secondary_devices::device_id
I0708 02:45:44.371009 1 model_config_utils.cc:1523]     ModelConfig::model_warmup::inputs::value::dims
I0708 02:45:44.371013 1 model_config_utils.cc:1523]     ModelConfig::optimization::cuda::graph_spec::graph_lower_bound::input::value::dim
I0708 02:45:44.371017 1 model_config_utils.cc:1523]     ModelConfig::optimization::cuda::graph_spec::input::value::dim
I0708 02:45:44.371021 1 model_config_utils.cc:1523]     ModelConfig::output::dims
I0708 02:45:44.371026 1 model_config_utils.cc:1523]     ModelConfig::output::reshape::shape
I0708 02:45:44.371030 1 model_config_utils.cc:1523]     ModelConfig::sequence_batching::direct::max_queue_delay_microseconds
I0708 02:45:44.371034 1 model_config_utils.cc:1523]     ModelConfig::sequence_batching::max_sequence_idle_microseconds
I0708 02:45:44.371038 1 model_config_utils.cc:1523]     ModelConfig::sequence_batching::oldest::max_queue_delay_microseconds
I0708 02:45:44.371042 1 model_config_utils.cc:1523]     ModelConfig::version_policy::specific::versions
I0708 02:45:44.371145 1 python.cc:1267] Using Python execution env /models/gpt2/gpt2.tar.gz
I0708 02:45:44.372722 1 python.cc:1489] TRITONBACKEND_ModelInstanceInitialize: gpt2_0 (GPU device 0)
I0708 02:45:44.374147 1 backend_model_instance.cc:105] Creating instance gpt2_0 on GPU 0 (7.0) using artifact ''
I0708 02:46:02.783664 56 python.cc:918] Starting Python backend stub: export LD_LIBRARY_PATH=/tmp/python_env_nED7ai/0/lib:$LD_LIBRARY_PATH; source /tmp/python_env_nED7ai/0/bin/activate && exec /models/gpt2/triton_python_backend_stub /models/gpt2/1/model.py /gpt2_0_GPU_0 67108864 67108864 1 /opt/tritonserver/backends/python
I0708 02:46:03.845931 1 python.cc:1549] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0708 02:46:03.851969 1 python.cc:1468] TRITONBACKEND_ModelFinalize: delete model state
I0708 02:46:03.852005 1 triton_backend_manager.cc:101] unloading backend 'python'
I0708 02:46:03.852011 1 python.cc:1425] TRITONBACKEND_Finalize: Start
I0708 02:46:04.577935 1 python.cc:1430] TRITONBACKEND_Finalize: End
E0708 02:46:04.579122 1 model_repository_manager.cc:1215] failed to load 'gpt2' version 1: Internal: Failed to initialize stub, stub process exited unexpectedly: gpt2_0
I0708 02:46:04.579144 1 model_repository_manager.cc:988] TriggerNextAction() 'gpt2' version 1: 0
I0708 02:46:04.579153 1 model_repository_manager.cc:1003] no next action, trigger OnComplete()
I0708 02:46:04.579229 1 model_repository_manager.cc:594] VersionStates() 'gpt2'
I0708 02:46:04.579281 1 model_repository_manager.cc:594] VersionStates() 'gpt2'
I0708 02:46:04.579335 1 server.cc:504]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0708 02:46:04.579402 1 server.cc:543]
+-------------+-----------------------------------------------------------------+--------+
| Backend     | Path                                                            | Config |
+-------------+-----------------------------------------------------------------+--------+
| tensorrt    | <built-in>                                                      | {}     |
| pytorch     | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so         | {}     |
| tensorflow  | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so | {}     |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {}     |
| openvino    | /opt/tritonserver/backends/openvino/libtriton_openvino.so       | {}     |
+-------------+-----------------------------------------------------------------+--------+

I0708 02:46:04.579414 1 model_repository_manager.cc:570] BackendStates()
I0708 02:46:04.579444 1 server.cc:586]
+-------+---------+--------------------------------------------------------------------------------------------+
| Model | Version | Status                                                                                     |
+-------+---------+--------------------------------------------------------------------------------------------+
| gpt2  | 1       | UNAVAILABLE: Internal: Failed to initialize stub, stub process exited unexpectedly: gpt2_0 |
+-------+---------+--------------------------------------------------------------------------------------------+

I0708 02:46:04.579550 1 tritonserver.cc:1718]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                  |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                 |
| server_version                   | 2.11.0                                                                                                                                                                                 |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0]         | /models                                                                                                                                                                                |
| model_control_mode               | MODE_NONE                                                                                                                                                                              |
| strict_model_config              | 1                                                                                                                                                                                      |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                              |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                                                               |
| min_supported_compute_capability | 6.0                                                                                                                                                                                    |
| strict_readiness                 | 1                                                                                                                                                                                      |
| exit_timeout                     | 30                                                                                                                                                                                     |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0708 02:46:04.579569 1 server.cc:234] Waiting for in-flight requests to complete.
I0708 02:46:04.579574 1 model_repository_manager.cc:694] AsyncUnload() 'gpt2'
I0708 02:46:04.579579 1 model_repository_manager.cc:988] TriggerNextAction() 'gpt2' version 1: 2
I0708 02:46:04.579584 1 model_repository_manager.cc:1071] Unload() 'gpt2' version 1
I0708 02:46:04.579591 1 model_repository_manager.cc:534] LiveBackendStates()
I0708 02:46:04.579595 1 server.cc:249] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
I0708 02:46:04.579602 1 triton_backend_manager.cc:101] unloading backend 'pytorch'
I0708 02:46:04.579613 1 triton_backend_manager.cc:101] unloading backend 'tensorflow'
I0708 02:46:04.579631 1 triton_backend_manager.cc:101] unloading backend 'onnxruntime'
I0708 02:46:04.579660 1 triton_backend_manager.cc:101] unloading backend 'openvino'
error: creating server: Internal - failed to load all models
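
The verbose log prints the exact stub command (the "Starting Python backend stub" line above) but not the stub's own stderr. One way to surface import-time failures is to replay that command by hand inside the container. A sketch, with two caveats: the /tmp/python_env_* directory is recreated on every server start, so the path must be copied from a fresh log line, and whether shared-memory attachment or module imports fail first is an assumption here:

# hypothetical: replay the logged stub command to capture its stderr;
# paths below are copied from the log line and will differ between runs
import subprocess

cmd = (
    "export LD_LIBRARY_PATH=/tmp/python_env_nED7ai/0/lib:$LD_LIBRARY_PATH; "
    "source /tmp/python_env_nED7ai/0/bin/activate && "
    "exec /models/gpt2/triton_python_backend_stub /models/gpt2/1/model.py "
    "/gpt2_0_GPU_0 67108864 67108864 1 /opt/tritonserver/backends/python"
)
proc = subprocess.run(["bash", "-c", cmd], capture_output=True, text=True)
print(proc.returncode)
print(proc.stderr)  # e.g. missing modules or a libpython version mismatch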

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Comments: 14 (7 by maintainers)

Top GitHub Comments

1 reaction
Tabrizian commented on Jul 13, 2021

Thanks for the detailed info. I have filed a bug against myself to investigate why this doesn’t happen in the cases that you have shared.

0 reactions
donggrame commented on Jul 13, 2021

@Tabrizian

  • docker: version 20.10.7, build f0df350
  • OS: ubuntu 18.04.5 LTS

I have found that the 'shared memory' error is triggered only under one specific condition: when a GPU is passed to the container and the model's kind is KIND_GPU. The combinations I tested are summarized in the table below.

GPU passed | KIND | error triggered
-----------+------+----------------
true       | GPU  | true
true       | CPU  | false (stalled)
false      | CPU  | false (stalled)

$ docker run --gpus=1 --rm --name custom-gpu -it custom-image

root@c3d5a4a5ac75:/opt/tritonserver# cat /models/gpt2/config.pbtxt
backend: "python"

max_batch_size: 64


input [
  {
    name: "INPUT0"
    data_type: TYPE_STRING
    dims: [ -1 ]
  }
]

output [
  {
    name: "OUTPUT0"
    data_type: TYPE_STRING
    dims: [ -1 ]
  }
]

dynamic_batching {
  preferred_batch_size: [ 1,2,4,8,16,32,64 ]
  max_queue_delay_microseconds: 30000
}

instance_group [{ count: 1, kind: KIND_GPU }]

version_policy: { latest { num_versions : 1 }}

parameters: {
  key: "EXECUTION_ENV_PATH",
  value: {string_value: "/models/gpt2/gpt2.tar.gz"}
}
root@c3d5a4a5ac75:/opt/tritonserver# tritonserver --model-repository=/models --log-verbose=20
I0713 04:08:20.589012 97 metrics.cc:291] Collecting metrics for GPU 0: Tesla V100-PCIE-16GB
I0713 04:08:20.589398 97 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/pytorch/libtriton_pytorch.so
I0713 04:08:20.931070 97 libtorch.cc:987] TRITONBACKEND_Initialize: pytorch
I0713 04:08:20.931109 97 libtorch.cc:997] Triton TRITONBACKEND API version: 1.4
I0713 04:08:20.931116 97 libtorch.cc:1003] 'pytorch' TRITONBACKEND API version: 1.4
I0713 04:08:20.931163 97 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so
2021-07-13 04:08:21.127982: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
I0713 04:08:21.173199 97 tensorflow.cc:2165] TRITONBACKEND_Initialize: tensorflow
I0713 04:08:21.173234 97 tensorflow.cc:2175] Triton TRITONBACKEND API version: 1.4
I0713 04:08:21.173247 97 tensorflow.cc:2181] 'tensorflow' TRITONBACKEND API version: 1.4
I0713 04:08:21.173254 97 tensorflow.cc:2205] backend configuration:
{}
I0713 04:08:21.173294 97 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so
I0713 04:08:21.174656 97 onnxruntime.cc:1969] TRITONBACKEND_Initialize: onnxruntime
I0713 04:08:21.174681 97 onnxruntime.cc:1979] Triton TRITONBACKEND API version: 1.4
I0713 04:08:21.174690 97 onnxruntime.cc:1985] 'onnxruntime' TRITONBACKEND API version: 1.4
I0713 04:08:21.184184 97 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/openvino/libtriton_openvino.so
I0713 04:08:21.192876 97 openvino.cc:1188] TRITONBACKEND_Initialize: openvino
I0713 04:08:21.192897 97 openvino.cc:1198] Triton TRITONBACKEND API version: 1.4
I0713 04:08:21.192906 97 openvino.cc:1204] 'openvino' TRITONBACKEND API version: 1.4
I0713 04:08:21.622389 97 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7fa6e4000000' with size 268435456
I0713 04:08:21.622922 97 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0713 04:08:21.623702 97 backend_factory.h:45] Create TritonBackendFactory
I0713 04:08:21.623725 97 plan_backend_factory.cc:49] Create PlanBackendFactory
I0713 04:08:21.623737 97 plan_backend_factory.cc:56] Registering TensorRT Plugins
I0713 04:08:21.623788 97 logging.cc:52] Registered plugin creator - ::BatchTilePlugin_TRT version 1
I0713 04:08:21.623806 97 logging.cc:52] Registered plugin creator - ::BatchedNMS_TRT version 1
I0713 04:08:21.623820 97 logging.cc:52] Registered plugin creator - ::BatchedNMSDynamic_TRT version 1
I0713 04:08:21.623832 97 logging.cc:52] Registered plugin creator - ::CoordConvAC version 1
I0713 04:08:21.623843 97 logging.cc:52] Registered plugin creator - ::CropAndResize version 1
I0713 04:08:21.623856 97 logging.cc:52] Registered plugin creator - ::CropAndResizeDynamic version 1
I0713 04:08:21.623870 97 logging.cc:52] Registered plugin creator - ::DetectionLayer_TRT version 1
I0713 04:08:21.623914 97 logging.cc:52] Registered plugin creator - ::FlattenConcat_TRT version 1
I0713 04:08:21.623925 97 logging.cc:52] Registered plugin creator - ::GenerateDetection_TRT version 1
I0713 04:08:21.623942 97 logging.cc:52] Registered plugin creator - ::GridAnchor_TRT version 1
I0713 04:08:21.623954 97 logging.cc:52] Registered plugin creator - ::GridAnchorRect_TRT version 1
I0713 04:08:21.623968 97 logging.cc:52] Registered plugin creator - ::InstanceNormalization_TRT version 1
I0713 04:08:21.623981 97 logging.cc:52] Registered plugin creator - ::LReLU_TRT version 1
I0713 04:08:21.623995 97 logging.cc:52] Registered plugin creator - ::MultilevelCropAndResize_TRT version 1
I0713 04:08:21.624022 97 logging.cc:52] Registered plugin creator - ::MultilevelProposeROI_TRT version 1
I0713 04:08:21.624040 97 logging.cc:52] Registered plugin creator - ::NMS_TRT version 1
I0713 04:08:21.624050 97 logging.cc:52] Registered plugin creator - ::NMSDynamic_TRT version 1
I0713 04:08:21.624062 97 logging.cc:52] Registered plugin creator - ::Normalize_TRT version 1
I0713 04:08:21.624076 97 logging.cc:52] Registered plugin creator - ::PriorBox_TRT version 1
I0713 04:08:21.624088 97 logging.cc:52] Registered plugin creator - ::ProposalLayer_TRT version 1
I0713 04:08:21.624099 97 logging.cc:52] Registered plugin creator - ::Proposal version 1
I0713 04:08:21.624110 97 logging.cc:52] Registered plugin creator - ::ProposalDynamic version 1
I0713 04:08:21.624126 97 logging.cc:52] Registered plugin creator - ::PyramidROIAlign_TRT version 1
I0713 04:08:21.624136 97 logging.cc:52] Registered plugin creator - ::Region_TRT version 1
I0713 04:08:21.624148 97 logging.cc:52] Registered plugin creator - ::Reorg_TRT version 1
I0713 04:08:21.624161 97 logging.cc:52] Registered plugin creator - ::ResizeNearest_TRT version 1
I0713 04:08:21.624175 97 logging.cc:52] Registered plugin creator - ::RPROI_TRT version 1
I0713 04:08:21.624187 97 logging.cc:52] Registered plugin creator - ::SpecialSlice_TRT version 1
I0713 04:08:21.624200 97 logging.cc:52] Registered plugin creator - ::Split version 1
I0713 04:08:21.624210 97 ensemble_backend_factory.cc:47] Create EnsembleBackendFactory
I0713 04:08:21.625643 97 model_repository_manager.cc:749] AsyncLoad() 'gpt2'
I0713 04:08:21.625723 97 model_repository_manager.cc:988] TriggerNextAction() 'gpt2' version 1: 1
I0713 04:08:21.625738 97 model_repository_manager.cc:1026] Load() 'gpt2' version 1
I0713 04:08:21.625748 97 model_repository_manager.cc:1045] loading: gpt2:1
I0713 04:08:21.726476 97 model_repository_manager.cc:1105] CreateInferenceBackend() 'gpt2' version 1
I0713 04:08:21.726559 97 shared_library.cc:108] OpenLibraryHandle: /opt/tritonserver/backends/python/libtriton_python.so
I0713 04:08:21.728570 97 python.cc:1305] 'python' TRITONBACKEND API version: 1.4
I0713 04:08:21.728590 97 python.cc:1327] backend configuration:
{}
I0713 04:08:21.728614 97 python.cc:1404] shm-default-byte-size=67108864,shm-growth-byte-size=67108864,stub-timeout-seconds=30
I0713 04:08:21.729038 97 python.cc:1452] TRITONBACKEND_ModelInitialize: gpt2 (version 1)
I0713 04:08:21.729926 97 model_config_utils.cc:1521] ModelConfig 64-bit fields:
I0713 04:08:21.729947 97 model_config_utils.cc:1523]    ModelConfig::dynamic_batching::default_queue_policy::default_timeout_microseconds
I0713 04:08:21.729956 97 model_config_utils.cc:1523]    ModelConfig::dynamic_batching::max_queue_delay_microseconds
I0713 04:08:21.729967 97 model_config_utils.cc:1523]    ModelConfig::dynamic_batching::priority_queue_policy::value::default_timeout_microseconds
I0713 04:08:21.729975 97 model_config_utils.cc:1523]    ModelConfig::ensemble_scheduling::step::model_version
I0713 04:08:21.729982 97 model_config_utils.cc:1523]    ModelConfig::input::dims
I0713 04:08:21.729989 97 model_config_utils.cc:1523]    ModelConfig::input::reshape::shape
I0713 04:08:21.729995 97 model_config_utils.cc:1523]    ModelConfig::instance_group::secondary_devices::device_id
I0713 04:08:21.730002 97 model_config_utils.cc:1523]    ModelConfig::model_warmup::inputs::value::dims
I0713 04:08:21.730011 97 model_config_utils.cc:1523]    ModelConfig::optimization::cuda::graph_spec::graph_lower_bound::input::value::dim
I0713 04:08:21.730019 97 model_config_utils.cc:1523]    ModelConfig::optimization::cuda::graph_spec::input::value::dim
I0713 04:08:21.730025 97 model_config_utils.cc:1523]    ModelConfig::output::dims
I0713 04:08:21.730033 97 model_config_utils.cc:1523]    ModelConfig::output::reshape::shape
I0713 04:08:21.730041 97 model_config_utils.cc:1523]    ModelConfig::sequence_batching::direct::max_queue_delay_microseconds
I0713 04:08:21.730051 97 model_config_utils.cc:1523]    ModelConfig::sequence_batching::max_sequence_idle_microseconds
I0713 04:08:21.730058 97 model_config_utils.cc:1523]    ModelConfig::sequence_batching::oldest::max_queue_delay_microseconds
I0713 04:08:21.730065 97 model_config_utils.cc:1523]    ModelConfig::version_policy::specific::versions
I0713 04:08:21.730227 97 python.cc:1274] Using Python execution env /models/gpt2/gpt2.tar.gz
I0713 04:08:21.731733 97 python.cc:1496] TRITONBACKEND_ModelInstanceInitialize: gpt2_0 (GPU device 0)
I0713 04:08:21.733169 97 backend_model_instance.cc:105] Creating instance gpt2_0 on GPU 0 (7.0) using artifact ''
I0713 04:08:25.131388 97 python.cc:1556] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0713 04:08:25.131459 97 python.cc:1475] TRITONBACKEND_ModelFinalize: delete model state
I0713 04:08:25.131495 97 triton_backend_manager.cc:101] unloading backend 'python'
I0713 04:08:25.131504 97 python.cc:1432] TRITONBACKEND_Finalize: Start
I0713 04:08:25.131656 97 python.cc:1437] TRITONBACKEND_Finalize: End
E0713 04:08:25.132559 97 model_repository_manager.cc:1215] failed to load 'gpt2' version 1: Internal: Unable to initialize shared memory key '/gpt2_0_GPU_0' to requested size (67108864 bytes). If you are running Triton inside docker, use '--shm-size' flag to control the shared memory region size. Each Python backend model instance requires at least 64MBs of shared memory. Flag '--shm-size=5G' should be sufficient for common usecases. Error: File exists
I0713 04:08:25.132578 97 model_repository_manager.cc:988] TriggerNextAction() 'gpt2' version 1: 0
I0713 04:08:25.132587 97 model_repository_manager.cc:1003] no next action, trigger OnComplete()
I0713 04:08:25.132643 97 model_repository_manager.cc:594] VersionStates() 'gpt2'
I0713 04:08:25.132686 97 model_repository_manager.cc:594] VersionStates() 'gpt2'
I0713 04:08:25.132729 97 server.cc:504]
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0713 04:08:25.132793 97 server.cc:543]
+-------------+-----------------------------------------------------------------+--------+
| Backend     | Path                                                            | Config |
+-------------+-----------------------------------------------------------------+--------+
| tensorrt    | <built-in>                                                      | {}     |
| pytorch     | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so         | {}     |
| tensorflow  | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so | {}     |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {}     |
| openvino    | /opt/tritonserver/backends/openvino/libtriton_openvino.so       | {}     |
+-------------+-----------------------------------------------------------------+--------+

I0713 04:08:25.132809 97 model_repository_manager.cc:570] BackendStates()
I0713 04:08:25.132846 97 server.cc:586]
+-------+---------+-----------------------------------------------------------------------------------------------------------------------------------------+
| Model | Version | Status                                                                                                                                  |
+-------+---------+-----------------------------------------------------------------------------------------------------------------------------------------+
| gpt2  | 1       | UNAVAILABLE: Internal: Unable to initialize shared memory key '/gpt2_0_GPU_0' to requested size (67108864 bytes). If you are running Tr |
|       |         | iton inside docker, use '--shm-size' flag to control the shared memory region size. Each Python backend model instance requires at leas |
|       |         | t 64MBs of shared memory. Flag '--shm-size=5G' should be sufficient for common usecases. Error: File exists                             |
+-------+---------+-----------------------------------------------------------------------------------------------------------------------------------------+

I0713 04:08:25.132935 97 tritonserver.cc:1718]
+----------------------------------+------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                  |
+----------------------------------+------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                 |
| server_version                   | 2.11.0                                                                                                                 |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration syste |
|                                  | m_shared_memory cuda_shared_memory binary_tensor_data statistics                                                       |
| model_repository_path[0]         | /models                                                                                                                |
| model_control_mode               | MODE_NONE                                                                                                              |
| strict_model_config              | 1                                                                                                                      |
| pinned_memory_pool_byte_size     | 268435456                                                                                                              |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                               |
| min_supported_compute_capability | 6.0                                                                                                                    |
| strict_readiness                 | 1                                                                                                                      |
| exit_timeout                     | 30                                                                                                                     |
+----------------------------------+------------------------------------------------------------------------------------------------------------------------+

I0713 04:08:25.132975 97 server.cc:234] Waiting for in-flight requests to complete.
I0713 04:08:25.132987 97 model_repository_manager.cc:694] AsyncUnload() 'gpt2'
I0713 04:08:25.132992 97 model_repository_manager.cc:988] TriggerNextAction() 'gpt2' version 1: 2
I0713 04:08:25.132997 97 model_repository_manager.cc:1071] Unload() 'gpt2' version 1
I0713 04:08:25.133004 97 model_repository_manager.cc:534] LiveBackendStates()
I0713 04:08:25.133013 97 server.cc:249] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
I0713 04:08:25.133020 97 triton_backend_manager.cc:101] unloading backend 'pytorch'
I0713 04:08:25.133029 97 triton_backend_manager.cc:101] unloading backend 'tensorflow'
I0713 04:08:25.133046 97 triton_backend_manager.cc:101] unloading backend 'onnxruntime'
I0713 04:08:25.133066 97 triton_backend_manager.cc:101] unloading backend 'openvino'
error: creating server: Internal - failed to load all models
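
The "Error: File exists" at the end of this log suggests a stale POSIX shared-memory segment ('/gpt2_0_GPU_0') left behind by an earlier crashed stub. Listing /dev/shm inside the container shows whether it is still present; removing it, or simply restarting the container, should let the next load attempt create the region cleanly. A minimal sketch, assuming the default /dev/shm mount:

# hypothetical cleanup of leftover shared-memory segments from crashed stubs
import os

for name in os.listdir("/dev/shm"):
    if name.startswith("gpt2"):
        print("stale segment:", name)
        # os.remove(os.path.join("/dev/shm", name))  # uncomment to delete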