Hello,

I’m trying to use Triton Inference Server (tritonserver:21.04-py3) with a TensorRT plan from your project. Model: retinaface.

Your TRT backend works perfectly:

docker run -p 18081:18080 -d --gpus 0 -e LOG_LEVEL=INFO -e PYTHONUNBUFFERED=0 -e NUM_WORKERS=1 -e INFERENCE_BACKEND=trt -e FORCE_FP16=True -e DET_NAME=retinaface_r50_v1 -e DET_THRESH=0.6 -e REC_NAME=glint360k_r100FC_1.0 -e REC_IGNORE=False -e REC_BATCH_SIZE=64 -e GA_NAME=genderage_v1 -e GA_IGNORE=False -e KEEP_ALL=True -e MAX_SIZE=1024,780 -e DEF_RETURN_FACE_DATA=True -e DEF_EXTRACT_EMBEDDING=True -e DEF_EXTRACT_GA=True -e DEF_API_VER='1'

But when I try to port your postprocessing from https://github.com/SthPhoenix/InsightFace-REST/blob/master/src/api_trt/modules/model_zoo/detectors/retinaface.py#L268 to decode Triton’s results, it doesn’t work:

dw and dh end up as empty lists for stride 16.

Do you have any recommendations?

Triton config: triton_model_config.zip

Jupyter notebook: triton_test.zip
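
For reference, this is roughly what the Triton side of such a call looks like with the Python HTTP client. It is only a sketch (not the attached notebook); the model name, the input name "data" and the face_rpn_*_stride* output names are assumptions and must match the tensors in the actual plan.

import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Preprocessed image, NCHW float32, using the same resize/normalization as the TRT backend.
blob = np.zeros((1, 3, 640, 640), dtype=np.float32)

inp = httpclient.InferInput("data", list(blob.shape), "FP32")
inp.set_data_from_numpy(blob)

result = client.infer(model_name="retinaface_r50_v1", inputs=[inp])

# Collect the per-stride heads before handing them to the retinaface decoder.
# The grouping/order here is an assumption; it must match what the decoder indexes.
net_out = []
for stride in (32, 16, 8):
    for head in ("face_rpn_cls_prob_reshape", "face_rpn_bbox_pred", "face_rpn_landmark_pred"):
        tensor = result.as_numpy(f"{head}_stride{stride}")
        if tensor is None:
            # A missing output usually means the name does not match the plan,
            # which later shows up as empty lists (e.g. dw/dh) for that stride.
            raise KeyError(f"no output named {head}_stride{stride}")
        net_out.append(tensor)

If the decoder still returns empty dw/dh for one stride, comparing these tensor names and shapes against what the TRT backend feeds into the same function is a quick way to spot an ordering or layout mismatch.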

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

1 reaction
SthPhoenix commented, Jun 3, 2021

> Thanks, I will wait for such updates if it intersects with your interests.
>
> close

I’m definitely interested in such updates, though it might take a while )

0 reactions
gulldan commented, Jun 3, 2021
  1. I have noticed that Triton for some reason states that the model is fp32, but if you compare the actual performance of the fp32 and fp16 models with the Triton perf client, the difference is obvious.
  2. I haven’t tested it yet since it requires a lot of changes to the source code, but now that Triton supports a Python backend and DALI preprocessing, it’s really worth a try (see the sketch at the end of this thread).
  3. It won’t be trivial to put the whole face detection/recognition pipeline into Triton, but it’s very promising, especially considering that some parts of the pipeline could be replaced with C++.

Thanks, I will wait for such updates if it intersects with your interests.

close
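
As a rough illustration of the Python backend mentioned in point 2 above: a python-backend model is a model.py exposing a TritonPythonModel class. Everything below is a minimal sketch and not code from this repo; the tensor names (SCORES, BBOX_DELTAS, DETECTIONS) are hypothetical and the decode/NMS step is left as a placeholder.

import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # args holds the model config, name, version, instance kind, etc.
        self.model_name = args["model_name"]

    def execute(self, requests):
        responses = []
        for request in requests:
            # Raw detector outputs passed in, e.g. from an ensemble step (names assumed).
            scores = pb_utils.get_input_tensor_by_name(request, "SCORES").as_numpy()
            deltas = pb_utils.get_input_tensor_by_name(request, "BBOX_DELTAS").as_numpy()

            # Anchor decoding and NMS would go here; this placeholder just
            # forwards the box deltas so the skeleton stays runnable.
            dets = deltas.astype(np.float32)

            out = pb_utils.Tensor("DETECTIONS", dets)
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses

    def finalize(self):
        pass

With something like this, detection postprocessing (and eventually the whole detect-then-recognize flow) could be expressed as a Triton ensemble, which is roughly what point 3 is about.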
