
21.12-py3 Server launching error when hosting a TRT model with custom plugin


Description
We want to use Triton Server to host models with custom plugins built in the TensorRT 21.12-py3 Docker environment, and we get different types of errors.

We created a minimal example reproducing the issue; for this case the error is:

E0105 05:32:15.146171 1 logging.cc:43] 1: [checkMacros.cpp::catchCudaError::272] Error Code 1: Cuda Runtime (initialization error)

Triton Information
What version of Triton are you using? 21.12-py3
Are you using the Triton container or did you build it yourself? We use the container directly:

docker pull nvcr.io/nvidia/tritonserver:21.12-py3

To Reproduce
We created a minimal example reproducing the issue: https://github.com/zmy1116/triton_server_custom_plugin_issue_21_12

The model consists of a single custom plugin layer taken from https://github.com/NVIDIA/TensorRT/tree/main/samples/python/uff_custom_plugin. It takes an input of size 1x10 and clips the values to the range [0.0, 0.5].
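
For reference, the layer's behavior is just an element-wise clip; a minimal numpy stand-in (a sketch, not the plugin code itself) would be:

import numpy as np

x = np.random.randn(1, 10).astype(np.float32)  # 1x10 model input
y = np.clip(x, 0.0, 0.5)                       # element-wise clip to [0.0, 0.5]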

We use the TensorRT 21.12 environment to build the model engine:

docker pull nvcr.io/nvidia/tensorrt:21.12-py3

We then host it directly using the Triton server container.

The full procedure is described in the attached repository; to summarize:

  1. In the TRT Docker environment:

docker run --gpus all -it -p8889:8889 --rm -v /home/ubuntu:/workspace/ubuntu nvcr.io/nvidia/tensorrt:21.12-py3

  2. Build the plugin

git clone https://github.com/zmy1116/triton_server_custom_plugin_issue_21_12
cd triton_server_custom_plugin_issue_21_12/custom_plugin
mkdir build
cd build 
cmake ..
make
cd ../../
  3. Create the TRT model engine (the plugin-loading part of the script is sketched after this list):

python create_engine.py

  4. Organize the produced engine and plugin in a model repository folder (a sample layout follows this list) and launch the Triton server:
docker run --gpus=all --rm --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -p8000:8000 -p8001:8001 -p8002:8002 -v /home/ubuntu/dummy_repository:/models -e LD_PRELOAD=/models/dummy/libclipplugin.so nvcr.io/nvidia/tritonserver:21.12-py3 tritonserver --model-repository=/models --strict-model-config=false
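
For reference, the plugin-loading portion of create_engine.py typically looks like the following (a sketch based on NVIDIA's uff_custom_plugin sample; the library path is an assumption matching the build steps above):

import ctypes
import tensorrt as trt

# Loading the shared library first lets the ClipPlugin creator self-register
# with TensorRT's plugin registry before the network is built or deserialized.
ctypes.CDLL("custom_plugin/build/libclipplugin.so")  # assumed path from the build step

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(TRT_LOGGER, "")  # register built-in and loaded plugins

The model repository passed to Triton is laid out roughly as follows (a sketch inferred from the docker command above; with --strict-model-config=false, Triton can auto-derive the model configuration for TensorRT engines, so config.pbtxt may be omitted):

dummy_repository/
└── dummy/
    ├── libclipplugin.so    # the plugin preloaded via LD_PRELOAD
    └── 1/
        └── model.plan      # serialized TensorRT engine from create_engine.py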

You will then see a full screen of the following error:

E0105 05:32:15.146037 1 logging.cc:43] 1: [checkMacros.cpp::catchCudaError::272] Error Code 1: Cuda Runtime (initialization error)

Expected behavior
I would expect the server to launch properly. This is a minimal example taken directly from the TensorRT examples.

For one of our own plugins, we actually see a different error: [Torch-TensorRT] - Unable to read CUDA capable devices. Return status
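
For anyone reproducing this, a minimal way to check whether the CUDA runtime can enumerate devices inside the container, independent of Triton, is something like the following (a sketch; the SONAME libcudart.so.11.0 is an assumption for the CUDA 11.x toolkit shipped in these images):

import ctypes

# Assumption: the CUDA 11.x runtime library is on the default library path.
cudart = ctypes.CDLL("libcudart.so.11.0")
count = ctypes.c_int()
status = cudart.cudaGetDeviceCount(ctypes.byref(count))
print("status:", status, "devices:", count.value)  # status 0 means cudaSuccess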

Please let me know if you need additional information.

Thanks

Top GitHub Comments

deadeyegoodwin commented, Jan 21, 2022

We are implementing an improved method to load shared libraries that implement TensorRT custom operations. @tanmayv25 please link this issue to the PR when you submit it.

zmy1116 commented, Jan 11, 2022

@deadeyegoodwin

  1. Yes, I’ve tested hosting models that do not use custom plugins, and they work.

  2. Yes, this specific example runs directly in the TensorRT Docker environment: docker pull nvcr.io/nvidia/tensorrt:21.12-py3

  3. Yes, models can be run on the GPU using PyTorch/TensorFlow (a minimal version of that check is sketched below).
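
A minimal version of the check in item 3, run in the PyTorch environment rather than the Triton container, might look like:

import torch

print(torch.cuda.is_available())      # True on this machine
print(torch.cuda.get_device_name(0))  # name of the visible GPU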
