Serve TensorRT or torch2trt model


TensorRT can decrease latency dramatically on some models, especially when the batch size is 1.
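Whether the latency gain holds for a given model is easy to check before committing to a conversion. The helper below is a minimal sketch, not part of TensorRT or TorchServe: `measure_latency_ms` is a hypothetical name, and it times any zero-argument callable, so the same function can be pointed at the original model and at the converted one.

```python
import statistics
import time


def measure_latency_ms(fn, warmup=10, iters=100):
    """Return the median wall-clock latency of fn() in milliseconds."""
    # Warm-up calls are discarded so one-time setup cost does not skew the result.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Assumed usage with a GPU model: CUDA kernels run asynchronously, so the
# callable should synchronize before returning, for example:
#
#   def run_once():
#       model(x)                    # x has batch size 1
#       torch.cuda.synchronize()
#
#   print(measure_latency_ms(run_once))
```

The median is used rather than the mean so that a few slow outliers do not dominate the comparison.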

torch2trt is a PyTorch to TensorRT converter that uses the TensorRT Python API. It can convert a model to TensorRT in one line of code, and the converted model runs with PyTorch inputs and outputs. See https://github.com/NVIDIA-AI-IOT/torch2trt.
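As a rough sketch of that workflow, following the basic usage shown in the torch2trt README: the function names `convert_and_save` and `load_converted` are hypothetical wrappers, and running them assumes a CUDA GPU with TensorRT and torch2trt installed.

```python
def convert_and_save(model, example_input, out_path):
    """Convert an eval-mode CUDA model to TensorRT and save its state dict."""
    # Imported lazily so this module can still be imported without the GPU stack.
    import torch
    from torch2trt import torch2trt

    # The one-line conversion: torch2trt runs the model on the example input
    # to build the TensorRT engine and returns a TRTModule.
    model_trt = torch2trt(model, [example_input])
    torch.save(model_trt.state_dict(), out_path)
    return model_trt


def load_converted(path):
    """Load a previously saved torch2trt state dict into a TRTModule."""
    import torch
    from torch2trt import TRTModule

    model_trt = TRTModule()
    model_trt.load_state_dict(torch.load(path))
    return model_trt


# Assumed usage (model and input shape are placeholders):
#
#   model = MyModel().eval().cuda()
#   x = torch.ones((1, 3, 224, 224)).cuda()
#   convert_and_save(model, x, "model.torch2trt")
#   y = load_converted("model.torch2trt")(x)   # PyTorch tensor in, tensor out
```

The saved file is an ordinary `torch.save` state dict, which is why a serving handler can load it with `TRTModule.load_state_dict`.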

I am wondering:

  1. Is there any risk in serving a TensorRT or torch2trt model with torchserve?
  2. Will there be official support for serving TensorRT models?

Describe the solution

It seems that torchserve can serve a torch2trt model pretty well, simply by rewriting the handler like this:

import logging

import torch
from torch2trt import TRTModule
from ts.torch_handler.base_handler import BaseHandler

logger = logging.getLogger(__name__)


class Yolov5FaceHandler(BaseHandler):
    def initialize(self, context):
        serialized_file = context.manifest["model"]["serializedFile"]
        # If serializedFile ends with .torch2trt instead of .pt, swap in the torch2trt loader.
        if serialized_file.split(".")[-1] == "torch2trt":
            # Overwrite the model loading function that BaseHandler.initialize calls.
            self._load_torchscript_model = self._load_torch2trt_model
        super().initialize(context)

    def _load_torch2trt_model(self, torch2trt_path):
        logger.info("Loading torch2trt model")
        model_trt = TRTModule()
        model_trt.load_state_dict(torch.load(torch2trt_path))
        return model_trt
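The override trick can be checked in isolation, without TorchServe or a GPU. The sketch below uses `StubBaseHandler`, a hypothetical and heavily simplified stand-in for `BaseHandler`, and assumes the base class's `initialize` always loads the model through `self._load_torchscript_model`. How the real `BaseHandler` picks a loader may differ between TorchServe versions, so that assumption should be verified against the installed release.

```python
from types import SimpleNamespace


class StubBaseHandler:
    """Simplified stand-in for TorchServe's BaseHandler (an assumption, not the real class)."""

    def initialize(self, context):
        path = context.manifest["model"]["serializedFile"]
        # Attribute lookup on self finds an instance-level override first.
        self.model = self._load_torchscript_model(path)

    def _load_torchscript_model(self, path):
        return ("torchscript", path)


class Torch2trtAwareHandler(StubBaseHandler):
    def initialize(self, context):
        serialized_file = context.manifest["model"]["serializedFile"]
        if serialized_file.endswith(".torch2trt"):
            # Shadow the inherited loader on this instance only.
            self._load_torchscript_model = self._load_torch2trt_model
        super().initialize(context)

    def _load_torch2trt_model(self, path):
        return ("torch2trt", path)


def load_kind(serialized_file):
    """Report which loader handled the given file name."""
    context = SimpleNamespace(manifest={"model": {"serializedFile": serialized_file}})
    handler = Torch2trtAwareHandler()
    handler.initialize(context)
    return handler.model[0]
```

Calling `load_kind` with a name ending in `.torch2trt` goes through the swapped-in loader, while a `.pt` name still uses the inherited one, because the override is set per instance and only when the extension matches.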

Describe alternative solutions

Maybe this feature could be added to ts/torch_handler/base_handler.py? Or there could be a new example handler for it.

Issue Analytics

  • State:closed
  • Created 2 years ago
  • Reactions:5
  • Comments:11 (5 by maintainers)

Top GitHub Comments

10 reactions
pallashadow commented, Feb 16, 2022

I created a GitHub repo that uses the self._load_torchscript_model override trick mentioned above. It is a production-ready demo of Yolov5_face + Torchserve + TensorRT + Docker: https://github.com/pallashadow/yolov5face_torchserve_tensorrt

3 reactions
pallashadow commented, Feb 8, 2022

@msaroufim I’d like to. I have used torch2trt with torchserve in a production environment for months, and it has worked well. Maybe I can try to write an example of yolov5 object detection with torch2trt.


