Unable to autofill for 'yolov4_nvidia', either all model tensor configuration should specify their dims or none
Description
I am trying to convert the pre-trained PyTorch YOLOv4 (darknet) model to a TensorRT INT8 engine with dynamic batching, to later deploy it on DS-Triton. I am following the general steps in the NVIDIA-AI-IOT/yolov4_deepstream sample, but I first hit issues with dynamic dimensions at the ONNX-to-TensorRT conversion step, and then when loading the model on DS-Triton:
Environment
TensorRT Version: 7.2.1
NVIDIA GPU: T4
NVIDIA Driver Version: 450.51.06
CUDA Version: 11.1
CUDNN Version: 8.0.4
Operating System: Ubuntu 18.04
Python Version (if applicable): 1.8
Tensorflow Version (if applicable):
PyTorch Version (if applicable): container image nvcr.io/nvidia/pytorch:20.11-py3
Baremetal or Container (if so, version): container image deepstream:5.1-21.02-triton
Relevant Files
YOLOV4 pre-trained model weights and cfg downloaded from https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov4.cfg https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.weights
Steps To Reproduce
Complete pipeline: PyTorch YOLOv4 (darknet) --> ONNX --> TensorRT --> DeepStream-Triton
Step 1: download cfg file and weights from the above link
Step 2: git clone repository pytorch-YOLOv4
$ sudo git clone https://github.com/Tianxiaomo/pytorch-YOLOv4.git
Step 3: Convert model YOLOv4 PyTorch --> ONNX | Dynamic batch size
$ sudo docker run --gpus all -it --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -v l/pytorch-YOLOv4/:/workspace/pytorch-YOLOv4/ nvcr.io/nvidia/pytorch:20.11-py3
$ cd /workspace/pytorch-YOLOv4
$ python demo_darknet2onnx.py "/workspace/pytorch-YOLOv4/models_cfg_weights/yolov4.cfg" "/workspace/pytorch-YOLOv4/models_cfg_weights/yolov4.weights" "/workspace/pytorch-YOLOv4/data/dog.jpg" -1
Result:
Onnx model exporting done
The model expects input shape: ['batch_size', 3, 608, 608]
Saved model: yolov4_-1_3_608_608_dynamic.onnx
Step 4: Convert model ONNX --> TensorRT | Dynamic Batch size
$ sudo docker run --gpus all -it --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=$DISPLAY -v /pytorch-YOLOv4/:/workspace/pytorch-YOLOv4/ deepstream:5.1-21.02-triton
$ /usr/src/tensorrt/bin/trtexec --onnx=yolov4_-1_3_608_608_dynamic.onnx --explicitBatch --minShapes=\'data\':1x3x608x608 --optShapes=\'data\':2x3x608x608 --maxShapes=\'data\':8x3x608x608 --workspace=4096 --buildOnly --saveEngine=yolov4_-1_3_608_608_dynamic_onnx_int8.engine --int8
Note: trtexec automatically overrides the engine shape to 1x3x608x608 instead of keeping dynamic batching. The warning below hints at why: the shape flags were bound to 'data', but the network input is named 'input', so trtexec behaved as if no shapes were provided.
[03/09/2021-22:24:24] [W] Dynamic dimensions required for input: input, but no shapes were provided. Automatically overriding shape to: 1x3x608x608
[03/09/2021-22:24:24] [I] FP32 and INT8 precisions have been specified - more performance might be enabled by additionally specifying --fp16 or --best
[03/09/2021-22:24:25] [W] [TRT] Calibrator is not being used. Users must provide dynamic range for all tensors that are not Int32.
[03/09/2021-22:43:52] [I] [TRT] Detected 1 inputs and 8 output network tensors.
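Given the warning above, a plausible fix is to bind the shape flags to the tensor name the exporter actually used ('input', inferred from the warning text) and drop the escaped quotes, which make the literal quote characters part of the tensor name. This is a hedged sketch of the corrected build command, not verified here:

```shell
# Sketch: rebuild with the optimization-profile shapes bound to the real
# input tensor name. 'input' is inferred from trtexec's warning; confirm
# the name against the ONNX graph before running.
/usr/src/tensorrt/bin/trtexec \
  --onnx=yolov4_-1_3_608_608_dynamic.onnx \
  --explicitBatch \
  --minShapes=input:1x3x608x608 \
  --optShapes=input:2x3x608x608 \
  --maxShapes=input:8x3x608x608 \
  --workspace=4096 --buildOnly --int8 \
  --saveEngine=yolov4_-1_3_608_608_dynamic_onnx_int8.engine
```

With the shapes actually accepted, the resulting engine should report a dynamic batch dimension instead of being frozen at 1x3x608x608.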
$ /usr/src/tensorrt/bin/trtexec --loadEngine=yolov4_-1_3_608_608_dynamic_onnx_int8.engine --int8
Result BS=1:
.
[03/09/2021-22:48:45] [E] [TRT] Parameter check failed at: engine.cpp::enqueue::445, condition: batchSize > 0 && batchSize <= mEngine.getMaxBatchSize(). Note: Batch size was: …, but engine max batch size was: 1
[03/09/2021-22:48:45] [I] Warmup completed 312 queries over 200 ms
[03/09/2021-22:48:45] [I] Timing trace has 4704 queries over 3.0043 s
[03/09/2021-22:48:45] [I] Trace averages of 10 runs:
.
[03/09/2021-22:46:29] [I] Host Latency
[03/09/2021-22:46:29] [I] min: 6.81131 ms (end to end 11.6827 ms)
[03/09/2021-22:46:29] [I] max: 10.3354 ms (end to end 21.7613 ms)
[03/09/2021-22:46:29] [I] mean: 7.02095 ms (end to end 12.1098 ms)
[03/09/2021-22:46:29] [I] median: 7.00833 ms (end to end 12.0729 ms)
[03/09/2021-22:46:29] [I] percentile: 7.2074 ms at 99% (end to end 12.4701 ms at 99%)
[03/09/2021-22:46:29] [I] throughput: 163.949 qps
[03/09/2021-22:46:29] [I] walltime: 3.02533 s
[03/09/2021-22:46:29] [I] Enqueue Time
[03/09/2021-22:46:29] [I] min: 1.49683 ms
[03/09/2021-22:46:29] [I] max: 1.841 ms
[03/09/2021-22:46:29] [I] median: 1.52332 ms
[03/09/2021-22:46:29] [I] GPU Compute
[03/09/2021-22:46:29] [I] min: 5.86343 ms
[03/09/2021-22:46:29] [I] max: 9.38628 ms
[03/09/2021-22:46:29] [I] mean: 6.0721 ms
[03/09/2021-22:46:29] [I] median: 6.05927 ms
[03/09/2021-22:46:29] [I] percentile: 6.25732 ms at 99%
[03/09/2021-22:46:29] [I] total compute time: 3.01176 s
Result BS=2:
Error:
03/09/2021-22:48:45] [E] [TRT] Parameter check failed at: engine.cpp::enqueue::445, condition: batchSize > 0 && batchSize <= mEngine.getMaxBatchSize(). Note: Batch size was: 2, but engine max batch size was: 1
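Note that even with a correctly built dynamic engine, trtexec needs to be told which shape to run at when benchmarking; for engines with dynamic dimensions the runtime shape must be supplied explicitly. A hedged sketch (assuming the input tensor is named 'input'):

```shell
# Sketch: benchmark the dynamic engine at batch size 2 by passing the
# runtime shape explicitly via --shapes.
/usr/src/tensorrt/bin/trtexec \
  --loadEngine=yolov4_-1_3_608_608_dynamic_onnx_int8.engine \
  --shapes=input:2x3x608x608 --int8
```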
Step 5: Config the DS-Triton files as described in the sample NVIDIA-AI-IOT/yolov4_deepstream
Step 6: Run YOLOV4 INT8 mode with Dynamic shapes with DS-Triton
$ deepstream-app -c deepstream_app_config_yoloV4.txt
Error: “unable to autofill for ‘yolov4_nvidia’, either all model tensor configuration should specify their dims or none”
root@1101333383d9:/workspace/Deepstream_5.1_Triton/samples/configs/deepstream-app-trtis# deepstream-app -c source1_primary_yolov4.txt
I0309 23:25:10.628131 260 metrics.cc:219] Collecting metrics for GPU 0: Tesla T4
I0309 23:25:10.634856 260 metrics.cc:219] Collecting metrics for GPU 1: Tesla T4
I0309 23:25:10.641297 260 metrics.cc:219] Collecting metrics for GPU 2: Tesla T4
I0309 23:25:10.647843 260 metrics.cc:219] Collecting metrics for GPU 3: Tesla T4
I0309 23:25:10.706528 260 pinned_memory_manager.cc:199] Pinned memory pool is created at '0x7febf8000000' with size 268435456
I0309 23:25:10.710959 260 cuda_memory_manager.cc:99] CUDA memory pool is created on device 0 with size 67108864
I0309 23:25:10.710967 260 cuda_memory_manager.cc:99] CUDA memory pool is created on device 1 with size 67108864
I0309 23:25:10.710972 260 cuda_memory_manager.cc:99] CUDA memory pool is created on device 2 with size 67108864
I0309 23:25:10.710976 260 cuda_memory_manager.cc:99] CUDA memory pool is created on device 3 with size 67108864
I0309 23:25:10.991848 260 server.cc:141]
| Backend | Config | Path |
I0309 23:25:10.991880 260 server.cc:184]
| Model | Version | Status |
I0309 23:25:10.991971 260 tritonserver.cc:1620]
| Option | Value |
| server_id | triton |
| server_version | 2.5.0 |
| server_extensions | classification sequence model_repository schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0] | /workspace/Deepstream_5.1_Triton/samples/trtis_model_repo |
| model_control_mode | MODE_EXPLICIT |
| strict_model_config | 0 |
| pinned_memory_pool_byte_size | 268435456 |
| cuda_memory_pool_byte_size{0} | 67108864 |
| cuda_memory_pool_byte_size{1} | 67108864 |
| cuda_memory_pool_byte_size{2} | 67108864 |
| cuda_memory_pool_byte_size{3} | 67108864 |
| min_supported_compute_capability | 6.0 |
| strict_readiness | 1 |
| exit_timeout | 30 |
E0309 23:25:22.300254 260 model_repository_manager.cc:1705] unable to autofill for 'yolov4_nvidia', either all model tensor configuration should specify their dims or none.
ERROR: infer_trtis_server.cpp:1044 Triton: failed to load model yolov4_nvidia, triton_err_str:Internal, err_msg:failed to load 'yolov4_nvidia', no version is available
ERROR: infer_trtis_backend.cpp:45 failed to load model: yolov4_nvidia, nvinfer error:NVDSINFER_TRTIS_ERROR
ERROR: infer_trtis_backend.cpp:184 failed to initialize backend while ensuring model:yolov4_nvidia ready, nvinfer error:NVDSINFER_TRTIS_ERROR
0:00:14.399726167 260 0x564fdec902f0 ERROR nvinferserver gstnvinferserver.cpp:362:gst_nvinfer_server_logger:<primary_gie> nvinferserver[UID 1]: Error in createNNBackend() <infer_trtis_context.cpp:246> [UID = 1]: failed to initialize trtis backend for model:yolov4_nvidia, nvinfer error:NVDSINFER_TRTIS_ERROR
I0309 23:25:22.300489 260 server.cc:280] Waiting for in-flight requests to complete.
I0309 23:25:22.300497 260 server.cc:295] Timeout 30: Found 0 live models and 0 in-flight non-inference requests
0:00:14.399831360 260 0x564fdec902f0 ERROR nvinferserver gstnvinferserver.cpp:362:gst_nvinfer_server_logger:<primary_gie> nvinferserver[UID 1]: Error in initialize() <infer_base_context.cpp:81> [UID = 1]: create nn-backend failed, check config file settings, nvinfer error:NVDSINFER_TRTIS_ERROR
0:00:14.399843072 260 0x564fdec902f0 WARN nvinferserver gstnvinferserver_impl.cpp:439:start:<primary_gie> error: Failed to initialize InferTrtIsContext
0:00:14.399868241 260 0x564fdec902f0 WARN nvinferserver gstnvinferserver_impl.cpp:439:start:<primary_gie> error: Config file path: /workspace/Deepstream_5.1_Triton/samples/configs/deepstream-app-trtis/config_infer_primary_yolov4.txt
0:00:14.400284532 260 0x564fdec902f0 WARN nvinferserver gstnvinferserver.cpp:460:gst_nvinfer_server_start:<primary_gie> error: gstnvinferserver_impl start failed
** ERROR: <main:655>: Failed to set pipeline to PAUSED
Quitting
ERROR from primary_gie: Failed to initialize InferTrtIsContext
Debug info: gstnvinferserver_impl.cpp(439): start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInferServer:primary_gie:
Config file path: /workspace/Deepstream_5.1_Triton/samples/configs/deepstream-app-trtis/config_infer_primary_yolov4.txt
ERROR from primary_gie: gstnvinferserver_impl start failed
Debug info: gstnvinferserver.cpp(460): gst_nvinfer_server_start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInferServer:primary_gie
App run failed
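The autofill error goes away once every input and output tensor is listed explicitly in the model's config.pbtxt (as the first comment below confirms). A sketch of such a config for the model repository, assuming the tensor names and output dims of the Tianxiaomo pytorch-YOLOv4 export ('input', 'boxes', 'confs') — verify these against the actual engine before use:

```
# Hedged sketch of trtis_model_repo/yolov4_nvidia/config.pbtxt
# Tensor names and output dims are assumptions, not taken from this issue.
name: "yolov4_nvidia"
platform: "tensorrt_plan"
max_batch_size: 8
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 608, 608 ]
  }
]
output [
  {
    name: "boxes"
    data_type: TYPE_FP32
    dims: [ -1, 1, 4 ]
  },
  {
    name: "confs"
    data_type: TYPE_FP32
    dims: [ -1, 80 ]
  }
]
```

With strict_model_config=0, Triton tries to autofill any missing tensor entries; mixing some tensors with dims and some without is what triggers the "either all model tensor configuration should specify their dims or none" error.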
I think the problem is with trtexec. Is there a sample/tool that shows how to optimize a YOLO PyTorch-to-ONNX model into a TensorRT engine in INT8 mode, with full INT8 calibration and dynamic input shapes?
Issue Analytics
- Created: 3 years ago
- Comments: 11 (5 by maintainers)
Top GitHub Comments
Hi @deadeyegoodwin, thanks for your prompt support. I listed all the input/output tensors and the model loaded successfully. Now, when I deploy the TensorRT engine in INT8 mode with dynamic batching via the DS-Triton integration, I am facing some performance issues; please see below:
Performance results below. For some reason the batch size is reset to 1:
WARNING from primary_gie: Configuration file batch-size reset to: 1
Convert ONNX --> TRT INT8 with dynamic batch size:
/usr/src/tensorrt/bin/trtexec --onnx=yolov4_-1_3_608_608_dynamic.onnx --explicitBatch --minShapes=\'input\':1x3x608x608 --optShapes=\'input\':2x3x608x608 --maxShapes=\'input\':8x3x608x608 --workspace=4096 --buildOnly --saveEngine=yolov4_-1_3_608_608_dynamic_onnx_int8_trtexec_3.engine --int8
Test: BS=1 | count:1 | PERF: 228.29 (228.06)
Test: BS=1 | count:8 | PERF: 229.84 (229.77)
Test: BS=2 | count:1 | Batch size error
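The "batch-size reset to: 1" warning typically means the batch size requested on the DeepStream side exceeds what the loaded model allows. A hedged fragment of the nvinferserver config (config_infer_primary_yolov4.txt), with field names per the DeepStream 5.x nvinferserver protobuf format as I understand it — paths and values are illustrative assumptions:

```
# Hedged sketch: max_batch_size here must not exceed max_batch_size in
# Triton's config.pbtxt, or DeepStream resets it to 1.
infer_config {
  unique_id: 1
  gpu_ids: [0]
  max_batch_size: 8
  backend {
    trt_is {
      model_name: "yolov4_nvidia"
      version: -1
      model_repo {
        root: "../../trtis_model_repo"
        log_level: 2
      }
    }
  }
}
```

The batch-size in the deepstream-app config's [primary-gie] section should be kept consistent with this value as well.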
Any suggestions or recommendations for debugging these performance issues? I want to take advantage of the concurrent-execution and dynamic batching features to boost performance.
@vilmara I see the DeepStream/Triton pipeline ran with the yolov4 model; did it produce correct outputs in the output video? I'm getting rectangles at random places in the video, whereas it runs correctly without Triton.