
deploying model using trtis is much slower than using frozen model directly

See original GitHub issue

After optimizing YOLOv3 with TF-TRT (TensorFlow-TensorRT integration), I run inference in two ways (a sketch of the conversion step follows this list):

  1. using TRTIS (~20 fps), which also prints the warning below:
layout failed: Invalid argument: The graph is already optimized by layout optimizer.
  2. using the frozen graph directly (~54 fps)
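
For context, a minimal sketch of the offline TF-TRT conversion step described above, assuming TensorFlow 1.x; the file paths and the output node name are placeholders, not values taken from this issue:

    # Sketch of offline TF-TRT conversion (TensorFlow 1.x contrib API).
    # Paths and node names below are illustrative placeholders.
    import tensorflow as tf
    import tensorflow.contrib.tensorrt as trt

    with tf.gfile.GFile("yolov3_frozen.pb", "rb") as f:
        frozen_graph = tf.GraphDef()
        frozen_graph.ParseFromString(f.read())

    # Replace TensorRT-compatible subgraphs with TRTEngineOp nodes.
    trt_graph = trt.create_inference_graph(
        input_graph_def=frozen_graph,
        outputs=["output_boxes"],          # hypothetical output node name
        max_batch_size=1,
        max_workspace_size_bytes=1 << 30,
        precision_mode="FP16")

    # The serialized result can be placed in a TRTIS model repository as
    # model.graphdef, or loaded directly with the TF API.
    with tf.gfile.GFile("model.graphdef", "wb") as f:
        f.write(trt_graph.SerializeToString())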

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
deadeyegoodwin commented, Jan 16, 2020

Are you actually providing a labels.txt? Given the dimensions of those outputs, the file would need to have 10647*80 = 850k+ entries… but that isn't related to your issue.

So you are using the offline TF-TRT conversion to create a graphdef, and then running that in TRTIS and also using some script that uses the TF API to load and run it directly?
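
For reference, the "load and run it directly" path being compared against usually looks something like this TF 1.x sketch; the tensor names and the 416x416 input shape are assumptions, not taken from the issue:

    # Minimal sketch of running a frozen / TF-TRT graphdef directly with the
    # TF 1.x API; tensor names and the input shape are illustrative.
    import numpy as np
    import tensorflow as tf

    graph_def = tf.GraphDef()
    with tf.gfile.GFile("model.graphdef", "rb") as f:
        graph_def.ParseFromString(f.read())

    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name="")
        with tf.Session(graph=graph) as sess:
            image = np.zeros((1, 416, 416, 3), dtype=np.float32)  # dummy frame
            boxes = sess.run("output_boxes:0",                 # hypothetical name
                             feed_dict={"inputs:0": image})    # hypothetical name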

What version of TRTIS are you using? Have you tried running perf_client against TRTIS to see what performance it gives you at various concurrency levels? Make sure you read this section of the documentation: https://docs.nvidia.com/deeplearning/sdk/tensorrt-inference-server-guide/docs/optimization.html
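
For anyone following along, a perf_client sweep over concurrency levels might look like the following; the model name is a placeholder, and exact flags vary between TRTIS releases, so check perf_client --help for your version:

    # Single in-flight request (concurrency 1), batch size 1.
    perf_client -m yolov3_trt -b 1 -t 1
    # Search over concurrency levels up to 4, reporting throughput
    # and latency at each level.
    perf_client -m yolov3_trt -b 1 -d -c 4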

0 reactions
deadeyegoodwin commented, Feb 3, 2020

Closing. Please reopen with perf_client results if those still show a problem.

Read more comments on GitHub >

Top Results From Across the Web

deploying model using trtis is much slower than using frozen ...
It seems that using the direct optimizer on trtis is available in recent versions. I didn't update documents so I don't know yet....
Read more >
Deploying Models from TensorFlow Model Zoo Using NVIDIA ...
Use the following code examples to optimize your TensorFlow network using TF-TRT, depending on your platform.
Read more >
Optimizing TensorFlow Models for Serving | by Lukman Ramsey
There are several techniques in TensorFlow that allow you to shrink the size of a model and improve prediction latency. Here are some...
Read more >
There are two very different ways to deploy ML models, here's ...
In this article, I'll provide you with a straightforward yet best-practices template for both kinds of deployment. As always, for the ...
Read more >
Tensorflow inference becomes slower and slower due to eval ...
So I have a frozen tensorflow model, which I can use to classify images. When I try to use this model to inference...
Read more >
