Torch ops unable to load "image.so"
🐛 Describe the bug
Torch ops unable to load “image.so”.
I’m able to reproduce this within the following Docker image:
FROM nvidia/cuda:11.3.0-base-ubuntu20.04
ARG DEBIAN_FRONTEND="noninteractive"
ENV TZ="America/Los_Angeles"
RUN rm -rf /etc/apt/sources.list.d/* \
    && apt-get update \
    && apt-get install -y \
        build-essential \
        curl \
        wget \
        git \
        zip \
        libssl-dev \
        software-properties-common \
        libffi-dev \
        python3-dev \
        python3-pip \
        python3-setuptools \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /work/
RUN pip3 install --no-cache-dir virtualenv && \
    virtualenv -p $(which python3) --copies --reset-app-data .venv && \
    .venv/bin/pip install --no-cache-dir torch==1.10.0+cu113 torchvision==0.11.1+cu113 torchaudio==0.10.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
ENTRYPOINT [".venv/bin/python", "-c", "import torch; torch.ops.load_library('.venv/lib/python3.8/site-packages/torchvision/image.so')"]
Build and run with:
docker build -t test .
docker run --gpus all --rm test
This prints out:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/work/.venv/lib/python3.8/site-packages/torch/_ops.py", line 110, in load_library
    ctypes.CDLL(path)
  File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libnvjpeg.so.11: cannot open shared object file: No such file or directory
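To check whether the dynamic loader can find the library independently of torch, something like the following could be run inside the container. This is only an illustrative sketch and not part of the original report: the helper names `can_load` and `missing_libraries` are hypothetical, and it relies solely on the standard-library `ctypes.CDLL` call that also appears in the traceback above.

```python
import ctypes


def can_load(name: str) -> bool:
    """Return True if the dynamic loader can open the shared library `name`."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the library (or one of its dependencies) cannot be opened.
        return False
    return True


def missing_libraries(names):
    """Return the subset of `names` that the dynamic loader cannot open."""
    return [name for name in names if not can_load(name)]


if __name__ == "__main__":
    # libnvjpeg.so.11 is the library named in the OSError above.
    for lib in ["libnvjpeg.so.11"]:
        print(lib, "loadable" if can_load(lib) else "NOT found")
```

If the library is reported as not found here as well, the problem lies with what is installed in the image (or on the loader's search path) rather than with `torch.ops.load_library` itself.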
Versions
Collecting environment information…
PyTorch version: 1.10.0+cu113
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.3 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31
Python version: 3.8.10 (default, Sep 28 2021, 16:10:42) [GCC 9.3.0] (64-bit runtime)
Python platform: Linux-5.11.0-1020-gcp-x86_64-with-glibc2.29
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: NVIDIA A100-SXM4-40GB
Nvidia driver version: 470.63.01
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] numpy==1.21.3
[pip3] torch==1.10.0+cu113
[pip3] torchaudio==0.10.0+cu113
[pip3] torchvision==0.11.1+cu113
[conda] Could not collect
Top GitHub Comments
I’ve fixed the regression in https://github.com/pytorch/vision/pull/4752, added extra testing in https://github.com/pytorch/pytorch.github.io/pull/872, and pushed new cu113 wheel binaries to download.pytorch.org. @epwalsh, can you please try again on your end and let me know if the problem is resolved?
Sure, I just tried and still got the same error. Here is the full Dockerfile: