
My TensorRT model cannot be loaded by Triton server

See original GitHub issue

Error info

E1217 06:50:10.976331 1 logging.cc:43] 1: [stdArchiveReader.cpp::StdArchiveReader::34] Error Code 1: Serialization (Serialization assertion safeVersionRead == safeSerializationVersion failed.Version tag does not match. Note: Current Version: 43, Serialized Engine Version: 0)
E1217 06:50:10.976343 1 logging.cc:43] 4: [runtime.cpp::deserializeCudaEngine::75] Error Code 4: Internal Error (Engine deserialization failed.)

I got my TRT model by converting an ONNX model outside the container, and my TRT version is 8.0.3.4.

I can also run the TRT model outside the container.

My Triton docker image version: 21.11-py3

What should I do to solve it?
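
The error in the log comes from TensorRT refusing to deserialize a plan that was serialized by a different TensorRT build (the "Version tag does not match" assertion). A quick way to confirm a mismatch is to print the TensorRT version both where the engine was built and inside the Triton image. This is a minimal sketch, assuming the tensorrt Python bindings are available in both environments:

# Run once on the host where the engine was built and once inside the
# Triton 21.11 container; the serialized plan only loads if the TensorRT
# builds match.
import tensorrt as trt

print("TensorRT version:", trt.__version__)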

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

3 reactions
prz30 commented, Dec 28, 2021

I met the same problem. It looks like a TRT version mismatch. I solved it by using the trtexec command inside the Triton docker container; the trtexec path in the Triton docker image is /usr/src/bin/trtexec
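
For reference, the conversion this comment describes can also be done with the TensorRT Python API instead of trtexec. The sketch below is a minimal ONNX-to-plan build, assuming TensorRT 8.x Python bindings are available inside the container; model.onnx and model.plan are placeholder paths.

# Build a serialized TensorRT engine from an ONNX model. Running this inside
# the Triton container guarantees the plan matches the container's TRT version.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("ONNX parse failed")

config = builder.create_builder_config()
config.max_workspace_size = 1 << 30  # 1 GiB builder workspace (TRT 8.0-era API)

plan = builder.build_serialized_network(network, config)
with open("model.plan", "wb") as f:
    f.write(plan)

The resulting model.plan can then replace the engine that was built outside the container in the Triton model repository.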

2 reactions
Tabrizian commented, Dec 17, 2021

Looks like a TRT version mismatch. The TRT version that you have used seems to be correct according to the DL Framework Support Matrix. Have you generated the TRT file inside the TensorRT 21.11 container?
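
One way to answer that question before restarting Triton is to try deserializing the plan with the TensorRT runtime inside the same container image; the call below wraps the same deserializeCudaEngine step that fails in the log above. This is a minimal sketch, assuming the tensorrt Python bindings are available; model.plan is a placeholder path.

# Sanity check: if this fails inside the container, Triton's
# deserializeCudaEngine call will fail the same way.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)

with open("model.plan", "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())

print("engine deserialized OK" if engine is not None else "deserialization failed")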

Read more comments on GitHub

Top Results From Across the Web

Serving TensorRT Models with NVIDIA Triton Inference Server
In real-time AI model deployment en masse, efficiency of model inference and hardware/GPU usage is paramount.

NVIDIA Triton Inference Server Container Versions
If you encounter accuracy issues with your TensorRT model, you can work around the issue by enabling the output_copy_stream option in your ...

Serving a Torch-TensorRT model with Triton - PyTorch
Let's discuss step-by-step, the process of optimizing a model with Torch-TensorRT, deploying it on Triton Inference Server, and building a client to query...

Serve multiple models with Amazon SageMaker and Triton ...
You can use NVIDIA Triton Inference Server to serve models for ... You can generate your own TensorRT engines according to your needs....

Serving Predictions with NVIDIA Triton | Vertex AI
NVIDIA Triton inference server (Triton) is an open-source inference serving ... Specifically, TensorRT, TensorFlow SavedModel, and ONNX models do not ...
