
No performance improvement when optimizing models


Description

In the config.pbtxt file I specify TensorRT optimization, but inference performance is unchanged.

Triton Information

21.05-py3, running the pre-built Docker container.

To Reproduce

Model config.pbtxt:

name: "VGG16"
platform: "tensorflow_savedmodel"
max_batch_size: 64
input  {
    name: "Input"
    data_type: TYPE_FP32
    dims: [ 224, 224, 3 ]
    format: FORMAT_NHWC
}
output  {
    name: "VGG16"
    data_type: TYPE_FP32
    dims: [ 1000]
    is_shape_tensor: false
}
optimization { execution_accelerators {
  gpu_execution_accelerator : [ {
    name : "tensorrt"
    parameters { key: "precision_mode" value: "FP16" }}]
}}
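One way to check the numbers for yourself is to time client requests directly. Below is a minimal sketch using the tritonclient Python package against a local Triton instance; the model name "VGG16" and input name "Input" come from the config above, while the port, batch size, and request count are assumptions, not part of the original report:

import time

import numpy as np
import tritonclient.http as httpclient

# Connect to a local Triton instance (default HTTP port; adjust as needed).
client = httpclient.InferenceServerClient(url="localhost:8000")

# Build a random NHWC batch matching the config above (batch size 8 <= 64).
batch = np.random.rand(8, 224, 224, 3).astype(np.float32)
inp = httpclient.InferInput("Input", list(batch.shape), "FP32")
inp.set_data_from_numpy(batch)

# Warm up first: with online TF-TRT, the first request triggers engine
# building and is much slower than steady state.
client.infer("VGG16", inputs=[inp])

start = time.time()
for _ in range(100):
    client.infer("VGG16", inputs=[inp])
print("mean latency: %.1f ms" % ((time.time() - start) / 100 * 1000.0))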

Expected behavior

If I apply TF-TRT optimization manually, outside the Triton Server container, inference speed improves by an order of magnitude. I expect the same to happen when loading the model in Triton Server.

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Offline TF-TRT conversion (note: this requests FP32, while the Triton
# config above requests FP16).
params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(precision_mode='FP32')
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir=self.model_dir,
    conversion_params=params)
converter.convert()

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

1 reaction
deadeyegoodwin commented, Jun 25, 2021

The most reliable path is to apply TF-TRT optimization outside of Triton and then use the resulting TF model with Triton. If you do that, you should see the full performance improvement provided by the TF-TRT optimization. Using TF-TRT optimization "online" in Triton is less reliable (as you have seen). When doing the optimization offline, be sure to request FP16 precision if that is what you want (as you did in the online specification).
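A minimal sketch of the offline workflow the maintainer describes, assuming TF 2.x; the source and repository paths are placeholders, and the destination follows Triton's standard SavedModel layout of <model-repository>/<model-name>/<version>/model.savedmodel:

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Convert offline, requesting the precision you actually want (FP16 here,
# matching the online config above, unlike the FP32 snippet in the report).
params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(precision_mode='FP16')
converter = trt.TrtGraphConverterV2(
    input_saved_model_dir='vgg16_savedmodel',  # placeholder source path
    conversion_params=params)
converter.convert()

# Save straight into the Triton model repository layout.
converter.save('model_repository/VGG16/1/model.savedmodel')

With FP16 baked into the SavedModel this way, the optimization { ... } block can be dropped from config.pbtxt, since Triton no longer needs to apply TF-TRT at load time.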

0 reactions
deadeyegoodwin commented, Jul 13, 2021

Closing due to inactivity.
