TensorRT version mismatch for NGC containers 20.11 and 20.10
Description
Loading a TensorRT model plan built with nvcr.io/nvidia/tensorrt:20.11-py3
into nvcr.io/nvidia/tritonserver:20.11-py3
fails at startup with the following version-mismatch error:
inference-server_1 | E1125 10:45:55.243042 1 logging.cc:43] coreReadArchive.cpp (41) - Serialization Error in verifyHeader: 0 (Version tag does not match. Note: Current Version: 96, Serialized Engine Version: 97)
It seems the NGC TensorRT container ships a slightly newer TensorRT version. To my understanding, containers with the same tag are supposed to be compatible and use the exact same version (which they also do according to the release notes of both containers).
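One way to confirm which TensorRT version each image actually ships (a sketch; the exact package layout inside the containers may differ between releases) is to query both containers directly:

```shell
# Print the TensorRT Python binding version in the TensorRT container
docker run --rm nvcr.io/nvidia/tensorrt:20.11-py3 \
    python3 -c "import tensorrt; print(tensorrt.__version__)"

# The Triton container may not ship the Python bindings, so inspect
# the installed libnvinfer packages instead
docker run --rm nvcr.io/nvidia/tritonserver:20.11-py3 \
    sh -c "dpkg -l | grep -i nvinfer"
```

If the two versions differ, any engine serialized in the first container will fail to deserialize in the second with exactly the "Version tag does not match" error above.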
Triton Information 20.11 (also reproducible with 20.10)
Are you using the Triton container or did you build it yourself? NGC
To Reproduce Build a minimal TensorRT engine in the 20.11 NGC TensorRT container and load it in the 20.11 NGC Triton container. Quick example: https://github.com/isarsoft/yolov4-triton-tensorrt
Expected behavior The TensorRT versions in the NGC releases of TensorRT and Triton should match.
Top GitHub Comments
As stated in the repo (I'm the author), you should build your engine inside Docker to avoid such version mismatches.
Every version of Triton uses a very specific TensorRT version, which you can look up in the release notes in the documentation. You need to match this exact version.
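As a sketch of that workflow (the tag and model paths here are placeholders; pick the tag that matches your Triton release in the support matrix), pinning both steps to the same NGC release tag keeps the serialized engine and the server on the same TensorRT version:

```shell
TAG=20.11-py3   # use the same release tag for both images

# Build the engine inside the matching TensorRT container
# (trtexec ships with the container; model.onnx is a placeholder)
docker run --rm -v "$PWD:/work" -w /work nvcr.io/nvidia/tensorrt:$TAG \
    trtexec --onnx=model.onnx --saveEngine=model.plan

# Serve it with the Triton container from the same release
docker run --rm -v "$PWD/models:/models" nvcr.io/nvidia/tritonserver:$TAG \
    tritonserver --model-repository=/models
```

Because serialized TensorRT engines are not portable across TensorRT versions, rebuilding the plan is required whenever the Triton (and hence TensorRT) version is upgraded.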
According to https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html#unique_383286676, my TensorRT version is too low. Thanks for your help.