Triton local build (branch r21.06) with Docker failed

Description: Building Triton locally with the command below failed.

./build.py --cmake-dir=$(pwd)/build --build-dir=/tmp/citritonbuild --enable-logging --enable-stats --enable-tracing --enable-metrics --filesystem=azure_storage --endpoint=http --endpoint=grpc --repo-tag=common:main --repo-tag=core:main --repo-tag=backend:main --repo-tag=thirdparty:main --backend=ensemble --backend=identity:main --backend=repeat:main --backend=tensorflow2:main --backend=python:main --repoagent=checksum:main

Step 17/20 : RUN apt-get update && apt-get install -y datacenter-gpu-manager
 ---> Running in 668ef74dd180
Ign:1 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  InRelease
Hit:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  Release
Hit:3 http://security.ubuntu.com/ubuntu focal-security InRelease
Hit:4 http://archive.ubuntu.com/ubuntu focal InRelease
Hit:6 https://apt.kitware.com/ubuntu focal InRelease
Hit:7 http://archive.ubuntu.com/ubuntu focal-updates InRelease
Hit:8 http://archive.ubuntu.com/ubuntu focal-backports InRelease
Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
The following NEW packages will be installed:
  datacenter-gpu-manager
0 upgraded, 1 newly installed, 0 to remove and 40 not upgraded.
Need to get 193 MB of archives.
After this operation, 437 MB of additional disk space will be used.
Get:1 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  datacenter-gpu-manager 1:2.2.9 [193 MB]
Fetched 193 MB in 2s (106 MB/s)
Selecting previously unselected package datacenter-gpu-manager.
(Reading database ... 43539 files and directories currently installed.)
Preparing to unpack .../datacenter-gpu-manager_1%3a2.2.9_amd64.deb ...
Unpacking datacenter-gpu-manager (1:2.2.9) ...
Setting up datacenter-gpu-manager (1:2.2.9) ...
Removing intermediate container 668ef74dd180
 ---> 869ad5224acb
Step 18/20 : RUN patch -ruN -d /usr/include/ < /workspace/build/libdcgm/dcgm_api_export.patch
 ---> Running in 461b5e1d67e0
The next patch would create the file dcgm_api_export.h,
which already exists!  Assume -R? [n]
Apply anyway? [n]
Skipping patch.
1 out of 1 hunk ignored
The command '/bin/sh -c patch -ruN -d /usr/include/ < /workspace/build/libdcgm/dcgm_api_export.patch' returned a non-zero code: 1

Triton Information: branch r21.06

Are you using the Triton container or did you build it yourself? Trying to build the container locally

To Reproduce: Run the build command above.

Describe the models (framework, inputs, outputs): N/A

Expected behavior: The build succeeds.

Issue Analytics

Created 2 years ago
Comments: 5 (3 by maintainers)

Top GitHub Comments

CoderHam commented, Jul 27, 2021

@jbkyang-nvi fixed this recently and will point you to a patch that resolves the issue.

The problem was caused by installing the latest DCGM release instead of a pinned release. The latest release already ships dcgm_api_export.h, so it no longer needs the patch applied in Step 18.
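Until the fix is picked up, one possible local workaround is to apply the patch only when the header is absent. The sketch below is an assumption, not the fix that landed in Triton: the function name apply_patch_if_missing and its three arguments are hypothetical, and it only guards the patch step against a header that already exists.

```shell
#!/bin/sh
# Hypothetical helper (not part of Triton's build.py): apply a patch that
# creates a header only when that header is not already installed.
#   $1: include directory, e.g. /usr/include
#   $2: header the patch would create, e.g. dcgm_api_export.h
#   $3: path to the patch file
apply_patch_if_missing() {
    include_dir="$1"
    header="$2"
    patch_file="$3"

    # Newer DCGM packages ship the header themselves, so patching again
    # fails with "which already exists!" as in the log above.
    if [ -e "$include_dir/$header" ]; then
        echo "skip: $include_dir/$header already exists"
        return 0
    fi

    # -u: unified diff, -N: ignore already-applied patches, -d: target dir
    patch -u -N -d "$include_dir" < "$patch_file"
}
```

In the generated Dockerfile this would stand in for the Step 18 command, for example apply_patch_if_missing /usr/include dcgm_api_export.h /workspace/build/libdcgm/dcgm_api_export.patch. The alternative suggested in the comment is to pin the package with apt's package=version syntax; the version Triton pins is not stated in this thread, so it is left out here.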

NonStatic2014 commented, Jul 27, 2021

This means that if I want to build r21.06 to reproduce something locally, e.g. to get a debug build for investigating https://github.com/triton-inference-server/server/issues/3119, I will hit this build break and have to look for the patch to work around it. Is this pain expected for users? 😉