No performance improvement when optimizing models
Description
In the config.pbtxt file I specify TensorRT optimization, but inference performance is unchanged.
Triton Information: 21.05-py3, running the pre-built Docker container.
To Reproduce
Model config.pbtxt:
name: "VGG16"
platform: "tensorflow_savedmodel"
max_batch_size: 64
input {
  name: "Input"
  data_type: TYPE_FP32
  dims: [ 224, 224, 3 ]
  format: FORMAT_NHWC
}
output {
  name: "VGG16"
  data_type: TYPE_FP32
  dims: [ 1000 ]
  is_shape_tensor: false
}
optimization {
  execution_accelerators {
    gpu_execution_accelerator: [ {
      name: "tensorrt"
      parameters { key: "precision_mode" value: "FP16" }
    } ]
  }
}
Expected behavior: If I apply TensorRT optimization manually, outside the Triton Server container, inference speed improves by an order of magnitude. I expect the same to happen when loading the model in Triton Server.
# Offline TF-TRT conversion of the SavedModel (TF 2.x)
from tensorflow.python.compiler.tensorrt import trt_convert as trt

params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(precision_mode='FP32')
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir=self.model_dir,
    conversion_params=params)
converter.convert()
# Persist the converted model so it can be served (output path is a placeholder):
converter.save(output_saved_model_dir=self.output_dir)
Issue Analytics
- Created 2 years ago
- Comments: 7 (4 by maintainers)
The most reliable path is to apply TF-TRT optimization outside of Triton and then serve the resulting TensorFlow model with Triton. If you do that, you should see the full performance improvement provided by the TF-TRT optimization. Using TF-TRT optimization "online" in Triton is less reliable, as you have seen. When doing the optimization offline, be sure to request FP16 precision if that is what you want (as you did in the online specification).
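The offline workflow described above ends with a plain TensorFlow SavedModel, so the online accelerator section can simply be dropped from the config. A minimal sketch of the resulting config.pbtxt (name and dims copied from the issue; this assumes the TF-TRT-converted SavedModel has been placed in the model repository in place of the original):

```
name: "VGG16"
platform: "tensorflow_savedmodel"
max_batch_size: 64
input {
  name: "Input"
  data_type: TYPE_FP32
  dims: [ 224, 224, 3 ]
  format: FORMAT_NHWC
}
output {
  name: "VGG16"
  data_type: TYPE_FP32
  dims: [ 1000 ]
}
# No optimization { execution_accelerators { ... } } block: the TensorRT
# engines are already embedded in the converted SavedModel.
```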
Closing due to inactivity.