Triton local build (branch r21.06) with Docker failed
Description: Building Triton locally with the command below failed.
./build.py --cmake-dir=$(pwd)/build --build-dir=/tmp/citritonbuild --enable-logging --enable-stats --enable-tracing --enable-metrics --filesystem=azure_storage --endpoint=http --endpoint=grpc --repo-tag=common:main --repo-tag=core:main --repo-tag=backend:main --repo-tag=thirdparty:main --backend=ensemble --backend=identity:main --backend=repeat:main --backend=tensorflow2:main --backend=python:main --repoagent=checksum:main
Step 17/20 : RUN apt-get update && apt-get install -y datacenter-gpu-manager
---> Running in 668ef74dd180
Ign:1 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 InRelease
Hit:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 Release
Hit:3 http://security.ubuntu.com/ubuntu focal-security InRelease
Hit:4 http://archive.ubuntu.com/ubuntu focal InRelease
Hit:6 https://apt.kitware.com/ubuntu focal InRelease
Hit:7 http://archive.ubuntu.com/ubuntu focal-updates InRelease
Hit:8 http://archive.ubuntu.com/ubuntu focal-backports InRelease
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
The following NEW packages will be installed:
datacenter-gpu-manager
0 upgraded, 1 newly installed, 0 to remove and 40 not upgraded.
Need to get 193 MB of archives.
After this operation, 437 MB of additional disk space will be used.
Get:1 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64 datacenter-gpu-manager 1:2.2.9 [193 MB]
Fetched 193 MB in 2s (106 MB/s)
Selecting previously unselected package datacenter-gpu-manager.
(Reading database ... 43539 files and directories currently installed.)
Preparing to unpack .../datacenter-gpu-manager_1%3a2.2.9_amd64.deb ...
Unpacking datacenter-gpu-manager (1:2.2.9) ...
Setting up datacenter-gpu-manager (1:2.2.9) ...
Removing intermediate container 668ef74dd180
---> 869ad5224acb
Step 18/20 : RUN patch -ruN -d /usr/include/ < /workspace/build/libdcgm/dcgm_api_export.patch
---> Running in 461b5e1d67e0
The next patch would create the file dcgm_api_export.h,
which already exists! Assume -R? [n]
Apply anyway? [n]
Skipping patch.
1 out of 1 hunk ignored
The command '/bin/sh -c patch -ruN -d /usr/include/ < /workspace/build/libdcgm/dcgm_api_export.patch' returned a non-zero code: 1
Triton Information: branch r21.06
Are you using the Triton container or did you build it yourself? Trying to build the container locally.
To Reproduce: As above.
Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well): N/A
Expected behavior: The build passes.
Issue Analytics
- Created 2 years ago
- Comments:5 (3 by maintainers)
Top GitHub Comments
@jbkyang-nvi fixed this recently and will point you to a patch that resolves the issue.
The problem was caused by the build installing the latest DCGM (datacenter-gpu-manager) release instead of a pinned release. The latest release already ships dcgm_api_export.h, so it no longer requires the patch mentioned before.
This means that if I want to build r21.06 and repro something locally (for example, a debug build to repro and investigate https://github.com/triton-inference-server/server/issues/3119), I will hit this build break and have to look for the patch to work around it. Is this pain expected for users? 😉
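Until the official fix is available, one possible local workaround is to make the failing step conditional. This is a minimal sketch, not the maintainers' patch. It assumes, based only on the log above, that the patch's sole effect is to create dcgm_api_export.h, so it skips the patch when that header already exists. The function names `needs_dcgm_patch` and `apply_dcgm_patch` are hypothetical, and the `patch` flags are copied unchanged from the failing step.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Triton build scripts):
# succeeds (exit 0) when the header the patch would create is missing,
# i.e. when the patch is still needed.
needs_dcgm_patch() {
    include_dir="$1"
    [ ! -e "${include_dir}/dcgm_api_export.h" ]
}

# Hypothetical wrapper around the failing Step 18 command.
# Applies the patch only when the header is absent; otherwise it
# prints a note and returns success so the build can continue.
apply_dcgm_patch() {
    include_dir="$1"
    patch_file="$2"
    if needs_dcgm_patch "${include_dir}"; then
        # Same flags as in the build log above.
        patch -ruN -d "${include_dir}" < "${patch_file}"
    else
        echo "dcgm_api_export.h already present in ${include_dir}; skipping patch"
    fi
}
```

With the paths from the log, the step would call `apply_dcgm_patch /usr/include /workspace/build/libdcgm/dcgm_api_export.patch`. Pinning datacenter-gpu-manager to the release the branch was originally built against would be the more faithful fix, but the thread does not state which version that is.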