pytorch_backend build fails with `cannot find -ltorch`

Description

I’m building the Triton server with:

./build.py --no-container-build --target-platform ubuntu \
    --cmake-dir=$(pwd)/build --build-dir=$(pwd)/citritonbuild \
    --enable-logging --enable-stats --enable-tracing \
    --enable-metrics --enable-gpu-metrics --enable-gpu \
    --filesystem=gcs --filesystem=azure_storage --filesystem=s3 \
    --endpoint=http --endpoint=grpc \
    --repo-tag=common:r21.05 --repo-tag=core:r21.05 \
    --repo-tag=backend:r21.05 --repo-tag=thirdparty:r21.05 \
    --backend=ensemble --backend=tensorrt --backend=python \
    --backend=tensorflow1 --backend=tensorflow2 --backend=pytorch \
    --repoagent=checksum \
    --image tensorflow1,nvcr.io/nvidia/tensorflow:21.05-tf1-py3 \
    --image tensorflow2,nvcr.io/nvidia/tensorflow:21.05-tf2-py3 \
    --image pytorch,nvcr.io/nvidia/pytorch:21.05-py3

It fails with:

[  9%] Built target kernel-library-new
[ 13%] Built target ptlib_target
[ 27%] Built target triton-common-async-work-queue
[ 36%] Built target triton-core-serverstub
[ 68%] Built target triton-backend-utils
[ 72%] Linking CXX shared library libtriton_pytorch.so
/usr/bin/ld: cannot find -ltorch
/usr/bin/ld: cannot find -ltorchvision
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/triton-pytorch-backend.dir/build.make:121: libtriton_pytorch.so] Error 1
make[1]: *** [CMakeFiles/Makefile2:173: CMakeFiles/triton-pytorch-backend.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

I notice that libtorch.so and libtorchvision.so are present in the build folder.

Triton Information

What version of Triton are you using? r21.05

Are you using the Triton container or did you build it yourself? I’m building it myself.

Issue Analytics

  • State:closed
  • Created 2 years ago
  • Comments:6 (2 by maintainers)

Top GitHub Comments

GowthamKudupudi commented, Jul 11, 2021

I’m referring to the build folder citritonbuild/pytorch/build. I see the message "Extracting pytorch and torchvision libraries and includes from nvcr.io/nvidia/pytorch:21.05-py3" in the build log at [ 13%]. Below is the complete build log:

gowtham@t-trt-ubuntu:~/server/citritonbuild/pytorch/build$ make clean
gowtham@t-trt-ubuntu:~/server/citritonbuild/pytorch/build$ make
[  4%] Building NVCC (Device) object _deps/repo-backend-build/CMakeFiles/kernel-library-new.dir/src/kernel-library-new_generated_kernel.cu.o
[  9%] Linking CXX static library libkernel-library-new.a
[  9%] Built target kernel-library-new
[ 13%] Extracting pytorch and torchvision libraries and includes from nvcr.io/nvidia/pytorch:21.05-py3
21.05-py3: Pulling from nvidia/pytorch
Digest: sha256:a5986639e4cf01eb35c0c0a9ca9fb9c6f905cc1b546966b78de4f69d15b894cf
Status: Image is up to date for nvcr.io/nvidia/pytorch:21.05-py3
nvcr.io/nvidia/pytorch:21.05-py3
Error: No such container: pytorch_backend_ptlib
error ignored...
bcf92ea4bbbd76853d665e06a3c401dc8319a04ae2db5ebb356aa0862629edec
pytorch_backend_ptlib
[ 13%] Built target ptlib_target
[ 18%] Building CXX object _deps/repo-common-build/CMakeFiles/triton-common-async-work-queue.dir/src/async_work_queue.cc.o
[ 22%] Building CXX object _deps/repo-common-build/CMakeFiles/triton-common-async-work-queue.dir/src/error.cc.o
[ 27%] Linking CXX static library libtritonasyncworkqueue.a
[ 27%] Built target triton-common-async-work-queue
[ 31%] Building CXX object _deps/repo-core-build/CMakeFiles/triton-core-serverstub.dir/src/tritonserver_stub.cc.o
[ 36%] Linking CXX shared library libtritonserver_stub.so
[ 36%] Built target triton-core-serverstub
[ 40%] Building CXX object _deps/repo-backend-build/CMakeFiles/triton-backend-utils.dir/src/backend_common.cc.o
[ 45%] Building CXX object _deps/repo-backend-build/CMakeFiles/triton-backend-utils.dir/src/backend_input_collector.cc.o
[ 50%] Building CXX object _deps/repo-backend-build/CMakeFiles/triton-backend-utils.dir/src/backend_memory.cc.o
[ 54%] Building CXX object _deps/repo-backend-build/CMakeFiles/triton-backend-utils.dir/src/backend_model_instance.cc.o
[ 59%] Building CXX object _deps/repo-backend-build/CMakeFiles/triton-backend-utils.dir/src/backend_model.cc.o
[ 63%] Building CXX object _deps/repo-backend-build/CMakeFiles/triton-backend-utils.dir/src/backend_output_responder.cc.o
[ 68%] Linking CXX static library libtritonbackendutils.a
[ 68%] Built target triton-backend-utils
[ 72%] Building CXX object CMakeFiles/triton-pytorch-backend.dir/src/libtorch.cc.o
In file included from /home/gowtham/server/citritonbuild/pytorch/src/libtorch.cc:43:
/home/gowtham/server/citritonbuild/pytorch/build/include/torchvision/torchvision/vision.h:10:40: warning: ‘_register_ops’ initialized and declared ‘extern’
   10 | extern "C" VISION_INLINE_VARIABLE auto _register_ops = &cuda_version;
      |                                        ^~~~~~~~~~~~~
[ 77%] Building CXX object CMakeFiles/triton-pytorch-backend.dir/src/libtorch_utils.cc.o
[ 81%] Linking CXX shared library libtriton_pytorch.so
/usr/bin/ld: cannot find -ltorch
/usr/bin/ld: cannot find -ltorchvision
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/triton-pytorch-backend.dir/build.make:121: libtriton_pytorch.so] Error 1
make[1]: *** [CMakeFiles/Makefile2:173: CMakeFiles/triton-pytorch-backend.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

GowthamKudupudi commented, Jul 11, 2021

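I fixed the issue by replacing -ltorch and -ltorchvision with ${CMAKE_CURRENT_BINARY_DIR}/libtorch.so and ${CMAKE_CURRENT_BINARY_DIR}/libtorchvision.so in CMakeLists.txt, so the linker receives full paths to the libraries instead of searching its library directories for them.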
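
For anyone hitting the same error, the change might look roughly like the sketch below. It is not the actual upstream CMakeLists.txt: the target name triton-pytorch-backend is taken from the build log, while the PRIVATE keyword and the shape of the surrounding call are assumptions, and the real call probably lists other libraries as well.

```cmake
# Sketch only: assumes the backend target is named triton-pytorch-backend
# (as the build log suggests) and that libtorch.so and libtorchvision.so
# were extracted into the backend's build directory.
target_link_libraries(
  triton-pytorch-backend
  PRIVATE
    # Before (lookup by name, fails with "cannot find -ltorch"):
    #   -ltorch
    #   -ltorchvision
    # After (explicit paths, so no library search is needed):
    ${CMAKE_CURRENT_BINARY_DIR}/libtorch.so
    ${CMAKE_CURRENT_BINARY_DIR}/libtorchvision.so
)
```

A flag such as -ltorch tells ld to look for libtorch.so in its library search directories, so the link fails when the build directory is not among them, even though the file exists there. An alternative that may also work is to keep the -l flags and add the directory with target_link_directories, but that was not tried in this thread.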
