Error - "Internal - failed to load all models"
I have been working on this for two days, but it remains unresolved. I have also read through the related issues.
computer environment
- os : ubuntu 20.04
- cuda : 11.1
- cudnn : 8.0.5
- tensorflow : 2.4.0
- triton : 20.12-py3
- gpu : RTX3090
- python : 3.8.5
model directory tree
The model is an EfficientDet-D4-based transfer-learning model.
model config.pbtxt
just two lines
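The two lines themselves are not reproduced in the issue, so the following is only a guess at what they contain. For a TensorFlow SavedModel served with `--strict-model-config=false`, a minimal two-line config.pbtxt typically declares just the model name and platform, letting Triton auto-complete the rest:

```
name: "detection"
platform: "tensorflow_savedmodel"
```

Note that the auto-completion step is exactly where the load fails in the logs below ("unable to auto-complete model configuration"), because Triton's TensorFlow backend must load the graph to infer inputs and outputs.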
command docker run --rm --gpus=1 --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v/home/sub/models:/models nvcr.io/nvidia/tritonserver:20.12-py3 tritonserver --model-repository=/models --strict-model-config=false
I get the same error whether or not I pass the --gpus option.
First, I thought it was a version-compatibility issue, so I tried many different combinations. Second, I thought it was a problem with the RTX 30xx series, so I checked the CUDA and NVIDIA driver versions carefully.
Neither approach resolved it, so I am opening this issue to ask for help.
This is my first time filing an issue, so please let me know right away if I have done anything wrong.
LOGS
docker run --rm --gpus=1 --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v/home/sub/models:/models nvcr.io/nvidia/tritonserver:20.12-py3 tritonserver --model-repository=/models --strict-model-config=false
=============================
== Triton Inference Server ==
=============================
NVIDIA Release 20.12 (build 18156940)
Copyright (c) 2018-2020, NVIDIA CORPORATION. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.
I0225 10:47:54.982586 1 metrics.cc:221] Collecting metrics for GPU 0: GeForce RTX 3090
I0225 10:47:55.235226 1 libtorch.cc:945] TRITONBACKEND_Initialize: pytorch
I0225 10:47:55.235264 1 libtorch.cc:955] Triton TRITONBACKEND API version: 1.0
I0225 10:47:55.235273 1 libtorch.cc:961] 'pytorch' TRITONBACKEND API version: 1.0
2021-02-25 10:47:55.418818: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
I0225 10:47:55.458605 1 tensorflow.cc:1877] TRITONBACKEND_Initialize: tensorflow
I0225 10:47:55.458658 1 tensorflow.cc:1887] Triton TRITONBACKEND API version: 1.0
I0225 10:47:55.458671 1 tensorflow.cc:1893] 'tensorflow' TRITONBACKEND API version: 1.0
I0225 10:47:55.458682 1 tensorflow.cc:1917] backend configuration:
{}
I0225 10:47:55.462070 1 onnxruntime.cc:1715] TRITONBACKEND_Initialize: onnxruntime
I0225 10:47:55.462106 1 onnxruntime.cc:1725] Triton TRITONBACKEND API version: 1.0
I0225 10:47:55.462119 1 onnxruntime.cc:1731] 'onnxruntime' TRITONBACKEND API version: 1.0
I0225 10:47:55.606966 1 pinned_memory_manager.cc:199] Pinned memory pool is created at '0x7fab9e000000' with size 268435456
I0225 10:47:55.607336 1 cuda_memory_manager.cc:103] CUDA memory pool is created on device 0 with size 67108864
I0225 10:47:55.608844 1 model_repository_manager.cc:787] loading: detection:1
I0225 10:47:55.709285 1 tensorflow.cc:1977] TRITONBACKEND_ModelInitialize: detection (version 1)
2021-02-25 10:47:55.711208: I tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /models/detection/1/model.savedmodel
2021-02-25 10:47:55.767476: I tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2021-02-25 10:47:55.845168: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-02-25 10:47:55.846284: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1665] Found device 0 with properties:
name: GeForce RTX 3090 major: 8 minor: 6 memoryClockRate(GHz): 1.755
pciBusID: 0000:1a:00.0
2021-02-25 10:47:55.846302: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-02-25 10:47:55.846321: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-02-25 10:47:55.846344: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-02-25 10:47:55.846358: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-02-25 10:47:55.846371: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2021-02-25 10:47:55.846383: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-02-25 10:47:55.846395: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-02-25 10:47:55.848297: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1793] Adding visible gpu devices: 0
2021-02-25 10:47:58.378540: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1206] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-02-25 10:47:58.378570: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] 0
2021-02-25 10:47:58.378573: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1225] 0: N
2021-02-25 10:47:58.381536: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1351] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21728 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3090, pci bus id: 0000:1a:00.0, compute capability: 8.6)
2021-02-25 10:47:58.396908: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55dd97a710d0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-02-25 10:47:58.396931: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 3090, Compute Capability 8.6
2021-02-25 10:47:58.417114: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3299990000 Hz
2021-02-25 10:47:58.418693: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fa9d27ecfa0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-02-25 10:47:58.418721: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-02-25 10:47:58.652328: I tensorflow/cc/saved_model/loader.cc:379] SavedModel load for tags { serve }; Status: fail. Took 2941125 microseconds.
I0225 10:47:58.699850 1 tensorflow.cc:2003] TRITONBACKEND_ModelFinalize: delete model state
E0225 10:47:58.699886 1 model_repository_manager.cc:963] failed to load 'detection' version 1: Internal: unable to auto-complete model configuration for 'detection', failed to load model: Op type not registered 'DecodeImage' in binary running on ebee40abd21b. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
I0225 10:47:58.700176 1 server.cc:490]
+-------------+-----------------------------------------------------------------+------+
| Backend | Config | Path |
+-------------+-----------------------------------------------------------------+------+
| pytorch | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so | {} |
| tensorflow | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so | {} |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {} |
+-------------+-----------------------------------------------------------------+------+
I0225 10:47:58.700246 1 server.cc:533]
+-----------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Model | Version | Status |
+-----------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| detection | 1 | UNAVAILABLE: Internal: unable to auto-complete model configuration for 'detection', failed to load model: Op type not registered 'DecodeImage' in binary running on ebee40abd21b. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is |
| | | first accessed. |
+-----------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
I0225 10:47:58.700324 1 tritonserver.cc:1620]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
| Option | Value |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id | triton |
| server_version | 2.6.0 |
| server_extensions | classification sequence model_repository schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0] | /models |
| model_control_mode | MODE_NONE |
| strict_model_config | 0 |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
I0225 10:47:58.700330 1 server.cc:215] Waiting for in-flight requests to complete.
I0225 10:47:58.700340 1 server.cc:230] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
Issue Analytics
- Created 3 years ago
- Comments: 7 (3 by maintainers)
Top GitHub Comments
This may be because 20.12 uses TF 2.3 and 21.02 uses TF 2.4.
I solved it by upgrading Triton from 20.12 to 21.02. I don't know exactly why, but the version upgrade fixed the problem.
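This is consistent with the error message: `DecodeImage` is an op that only exists in newer TensorFlow builds, so a SavedModel exported with TF 2.4 cannot load in a server whose backend bundles an older TF. Before upgrading, you can check whether a SavedModel references a given op. A proper check would parse the protobuf with TensorFlow installed; the sketch below is a cruder, TF-free heuristic that simply scans the serialized graph for the op-name bytes (op names are stored as plain strings inside the protobuf, so a byte search finds most uses, though it can produce false positives on substrings):

```python
def graph_uses_op(saved_model_pb_path: str, op_name: str) -> bool:
    """Crude heuristic: report whether a serialized SavedModel/GraphDef
    appears to reference the given op, by raw byte search."""
    with open(saved_model_pb_path, "rb") as f:
        return op_name.encode("utf-8") in f.read()


# Hypothetical usage against the model path from the logs:
# graph_uses_op("/models/detection/1/model.savedmodel/saved_model.pb",
#               "DecodeImage")
```

If this returns True for an op your serving TF version lacks, the options are upgrading the server (as done here, 20.12 -> 21.02) or re-exporting the model so the unsupported op is kept out of the serving graph.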
close