Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Error - "Internal - failed to load all models"

See original GitHub issue

I've been stuck on this for two days and it's still unresolved. Of course, I also looked through the related issues.

Computer environment

  • os : ubuntu 20.04
  • cuda : 11.1
  • cudnn : 8.0.5
  • tensorflow : 2.4.0
  • triton : 20.12-py3
  • gpu : RTX3090
  • python : 3.8.5

(image: model directory tree)

The model is an EfficientDet-D4 based transfer learning model.

(image: model config.pbtxt)

The config.pbtxt has just two lines (a sketch of what they most likely contain follows below).
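For reference, a minimal two-line config.pbtxt for a TF SavedModel normally only names the model and the platform. The contents of the screenshot aren't available, so the following is an assumption based on the model name and the model.savedmodel path that appear in the logs:

name: "detection"
platform: "tensorflow_savedmodel"

With --strict-model-config=false, Triton tries to auto-complete the rest of the configuration (inputs, outputs, max_batch_size) from the SavedModel itself, which is exactly the step that fails in the logs.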

Command:

docker run --rm --gpus=1 --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v/home/sub/models:/models nvcr.io/nvidia/tritonserver:20.12-py3 tritonserver --model-repository=/models --strict-model-config=false

I get the same error whether I pass the --gpus option or not.

First, I thought it was a version compatibility issue, so I tried many combinations. Second, I thought it was a problem with the RTX 30xx series, so I checked the CUDA and NVIDIA driver versions carefully (the usual checks are shown below).
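For anyone verifying the same thing, these are the standard NVIDIA tools for checking the versions listed in the environment section above; the cuDNN header path may differ depending on how cuDNN was installed:

nvidia-smi                                                     # driver version and visible GPUs
nvcc --version                                                 # CUDA toolkit version on the host
grep -A 2 "define CUDNN_MAJOR" /usr/include/cudnn_version.h    # cuDNN version

Note that the Triton container ships its own CUDA and cuDNN, so for the container itself only the host driver version really matters.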

It's still not resolved, so I'm opening this issue to ask for help.

This is my first time opening an issue, so please let me know if I've done anything wrong.

LOGS

docker run --rm --gpus=1 --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v/home/sub/models:/models nvcr.io/nvidia/tritonserver:20.12-py3 tritonserver --model-repository=/models --strict-model-config=false

=============================
== Triton Inference Server ==
=============================

NVIDIA Release 20.12 (build 18156940)

Copyright (c) 2018-2020, NVIDIA CORPORATION.  All rights reserved.

Various files include modifications (c) NVIDIA CORPORATION.  All rights reserved.
NVIDIA modifications are covered by the license terms that apply to the underlying
project or file.

I0225 10:47:54.982586 1 metrics.cc:221] Collecting metrics for GPU 0: GeForce RTX 3090
I0225 10:47:55.235226 1 libtorch.cc:945] TRITONBACKEND_Initialize: pytorch
I0225 10:47:55.235264 1 libtorch.cc:955] Triton TRITONBACKEND API version: 1.0
I0225 10:47:55.235273 1 libtorch.cc:961] 'pytorch' TRITONBACKEND API version: 1.0
2021-02-25 10:47:55.418818: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
I0225 10:47:55.458605 1 tensorflow.cc:1877] TRITONBACKEND_Initialize: tensorflow
I0225 10:47:55.458658 1 tensorflow.cc:1887] Triton TRITONBACKEND API version: 1.0
I0225 10:47:55.458671 1 tensorflow.cc:1893] 'tensorflow' TRITONBACKEND API version: 1.0
I0225 10:47:55.458682 1 tensorflow.cc:1917] backend configuration:
{}
I0225 10:47:55.462070 1 onnxruntime.cc:1715] TRITONBACKEND_Initialize: onnxruntime
I0225 10:47:55.462106 1 onnxruntime.cc:1725] Triton TRITONBACKEND API version: 1.0
I0225 10:47:55.462119 1 onnxruntime.cc:1731] 'onnxruntime' TRITONBACKEND API version: 1.0
I0225 10:47:55.606966 1 pinned_memory_manager.cc:199] Pinned memory pool is created at '0x7fab9e000000' with size 268435456
I0225 10:47:55.607336 1 cuda_memory_manager.cc:103] CUDA memory pool is created on device 0 with size 67108864
I0225 10:47:55.608844 1 model_repository_manager.cc:787] loading: detection:1
I0225 10:47:55.709285 1 tensorflow.cc:1977] TRITONBACKEND_ModelInitialize: detection (version 1)
2021-02-25 10:47:55.711208: I tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /models/detection/1/model.savedmodel
2021-02-25 10:47:55.767476: I tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2021-02-25 10:47:55.845168: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-02-25 10:47:55.846284: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1665] Found device 0 with properties: 
name: GeForce RTX 3090 major: 8 minor: 6 memoryClockRate(GHz): 1.755
pciBusID: 0000:1a:00.0
2021-02-25 10:47:55.846302: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-02-25 10:47:55.846321: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-02-25 10:47:55.846344: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-02-25 10:47:55.846358: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-02-25 10:47:55.846371: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2021-02-25 10:47:55.846383: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-02-25 10:47:55.846395: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-02-25 10:47:55.848297: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1793] Adding visible gpu devices: 0
2021-02-25 10:47:58.378540: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1206] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-02-25 10:47:58.378570: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212]      0 
2021-02-25 10:47:58.378573: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1225] 0:   N 
2021-02-25 10:47:58.381536: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1351] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21728 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3090, pci bus id: 0000:1a:00.0, compute capability: 8.6)
2021-02-25 10:47:58.396908: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55dd97a710d0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2021-02-25 10:47:58.396931: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce RTX 3090, Compute Capability 8.6
2021-02-25 10:47:58.417114: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3299990000 Hz
2021-02-25 10:47:58.418693: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fa9d27ecfa0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-02-25 10:47:58.418721: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-02-25 10:47:58.652328: I tensorflow/cc/saved_model/loader.cc:379] SavedModel load for tags { serve }; Status: fail. Took 2941125 microseconds.
I0225 10:47:58.699850 1 tensorflow.cc:2003] TRITONBACKEND_ModelFinalize: delete model state
E0225 10:47:58.699886 1 model_repository_manager.cc:963] failed to load 'detection' version 1: Internal: unable to auto-complete model configuration for 'detection', failed to load model: Op type not registered 'DecodeImage' in binary running on ebee40abd21b. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is first accessed.
I0225 10:47:58.700176 1 server.cc:490] 
+-------------+-----------------------------------------------------------------+------+
| Backend     | Config                                                          | Path |
+-------------+-----------------------------------------------------------------+------+
| pytorch     | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so         | {}   |
| tensorflow  | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so | {}   |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {}   |
+-------------+-----------------------------------------------------------------+------+

I0225 10:47:58.700246 1 server.cc:533] 
+-----------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Model     | Version | Status                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
+-----------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| detection | 1       | UNAVAILABLE: Internal: unable to auto-complete model configuration for 'detection', failed to load model: Op type not registered 'DecodeImage' in binary running on ebee40abd21b. Make sure the Op and Kernel are registered in the binary running in this process. Note that if you are loading a saved graph which used ops from tf.contrib, accessing (e.g.) `tf.contrib.resampler` should be done before importing the graph, as contrib ops are lazily registered when the module is |
|           |         |  first accessed.                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
+-----------+---------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0225 10:47:58.700324 1 tritonserver.cc:1620] 
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                              |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                             |
| server_version                   | 2.6.0                                                                                                                                              |
| server_extensions                | classification sequence model_repository schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0]         | /models                                                                                                                                            |
| model_control_mode               | MODE_NONE                                                                                                                                          |
| strict_model_config              | 0                                                                                                                                                  |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                          |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                           |
| min_supported_compute_capability | 6.0                                                                                                                                                |
| strict_readiness                 | 1                                                                                                                                                  |
| exit_timeout                     | 30                                                                                                                                                 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------+

I0225 10:47:58.700330 1 server.cc:215] Waiting for in-flight requests to complete.
I0225 10:47:58.700340 1 server.cc:230] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

2 reactions
tanmayv25 commented, Mar 4, 2021

This may be because 20.12 uses TF 2.3 and 21.02 uses TF 2.4
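That matches the error text. A quick (if crude) way to confirm that the exported graph really references the DecodeImage op is to search for the op name inside saved_model.pb; the path below is the host-side location of the model.savedmodel directory from the logs (mounted at /models in the container), and strings/grep are just standard tools, nothing Triton-specific:

strings /home/sub/models/detection/1/model.savedmodel/saved_model.pb | grep -c DecodeImage

A non-zero count means the graph uses DecodeImage, which (as far as I can tell) only exists from TF 2.4 onwards, so the TF 2.3-based backend in the 20.12 container cannot register it.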

1 reaction
dnfwlxo11 commented, Mar 3, 2021

I solved it by upgrading the version (20.12 -> 21.02). I'm not sure exactly why, but that fixed the problem.
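In practice that just means re-running the original command with the newer image tag; the line below is the command from the issue with only the tag changed (21.02-py3 ships the TF 2.4-based backend mentioned above):

docker run --rm --gpus=1 --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v/home/sub/models:/models nvcr.io/nvidia/tritonserver:21.02-py3 tritonserver --model-repository=/models --strict-model-config=false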


Read more comments on GitHub >

Top Results From Across the Web

error: creating server: INTERNAL - failed to load all models ...
I just run a simple demo. The model is downloaded by tensorflow.keras.applications.resnet50 and saved with model.save('./resnet50', save_format ...
Read more >
Triton server died before reaching ready state. Terminating ...
It appears that all the models are initially loading successfully, but then they start unloading and then the process terminates. I have tried ......
Read more >
Security Xray Scan Knife Detection - Seeed Wiki
Knife Detection: An Object Detection Model deployed on Triton Inference Server based on ... if error: creating server: Internal - failed to load...
Read more >
triton-inference-server startup reports "Internal - failed to load all models"
"2010 Computer Science Unified Exam Questions: Data Structures" is shared by a member and can be read online; for more on "2010 Computer Science Unified Exam Questions: Data Structures (23-page archived edition)", please search on 人人文库网 (Renren Wenku).
Read more >
Triton Inference Server: The Basics and a Quick Tutorial
Models that load correctly should display a “ready” status. Models that fail to load will send failure reports describing the cause of the...
Read more >
