Torch ops unable to load "image.so"

See original GitHub issue

🐛 Describe the bug

torch.ops.load_library fails to load torchvision’s “image.so” because libnvjpeg.so.11 cannot be found.

I’m able to reproduce this within the following Docker image:

FROM nvidia/cuda:11.3.0-base-ubuntu20.04

ARG DEBIAN_FRONTEND="noninteractive"
ENV TZ="America/Los_Angeles"

RUN rm -rf /etc/apt/sources.list.d/* \
    && apt-get update \
    && apt-get install -y \
        build-essential \
        curl \
        wget \
        git \
        zip \
        libssl-dev \
        software-properties-common \
        libffi-dev \
        python3-dev \
        python3-pip \
        python3-setuptools \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /work/

RUN pip3 install --no-cache-dir virtualenv && \
    virtualenv -p $(which python3) --copies --reset-app-data .venv && \
    .venv/bin/pip install --no-cache-dir torch==1.10.0+cu113 torchvision==0.11.1+cu113 torchaudio==0.10.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html

ENTRYPOINT [".venv/bin/python", "-c", "import torch; torch.ops.load_library('.venv/lib/python3.8/site-packages/torchvision/image.so')"]
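The ENTRYPOINT above hard-codes the python3.8 site-packages path. As a rough sketch that is not part of the original report, the path could instead be resolved at runtime. The helper name find_extension is hypothetical, and the sketch assumes the shared object sits directly in the package directory, as image.so does in the path shown above.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_extension(package: str, filename: str) -> Optional[Path]:
    """Return the path to `filename` inside `package`'s directory, or None.

    Hypothetical helper: assumes the file lives next to the package's
    __init__.py, as torchvision/image.so does in the report above.
    """
    # For a top-level package, find_spec locates it without importing it.
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return None
    candidate = Path(spec.origin).parent / filename
    return candidate if candidate.exists() else None


# Example use inside the container:
#   path = find_extension("torchvision", "image.so")
#   if path is not None:
#       import torch
#       torch.ops.load_library(str(path))
```

Resolving the path this way keeps the reproduction independent of the Python minor version in the base image.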

Build and run with:

docker build -t test .
docker run --gpus all --rm test

This prints out:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/work/.venv/lib/python3.8/site-packages/torch/_ops.py", line 110, in load_library
    ctypes.CDLL(path)
  File "/usr/lib/python3.8/ctypes/__init__.py", line 373, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libnvjpeg.so.11: cannot open shared object file: No such file or directory
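The OSError names only the first dependency the dynamic loader could not resolve. A minimal sketch for listing every unresolved dependency of a shared object is shown here. It assumes the ldd tool is available in the image, which the report does not state, and both function names are hypothetical.

```python
import subprocess
from typing import List


def missing_shared_libs(ldd_output: str) -> List[str]:
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # Unresolved entries look like: "\tlibnvjpeg.so.11 => not found"
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_library(path: str) -> List[str]:
    """Run ldd on `path` and return its unresolved dependencies."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return missing_shared_libs(result.stdout)


# Example use inside the container:
#   check_library(".venv/lib/python3.8/site-packages/torchvision/image.so")
```

If libnvjpeg.so.11 appears in the result, the library is not on the loader's search path in that image.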

Versions

Collecting environment information…
PyTorch version: 1.10.0+cu113
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.3 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31

Python version: 3.8.10 (default, Sep 28 2021, 16:10:42) [GCC 9.3.0] (64-bit runtime)
Python platform: Linux-5.11.0-1020-gcp-x86_64-with-glibc2.29
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: NVIDIA A100-SXM4-40GB
Nvidia driver version: 470.63.01
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.21.3
[pip3] torch==1.10.0+cu113
[pip3] torchaudio==0.10.0+cu113
[pip3] torchvision==0.11.1+cu113
[conda] Could not collect

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 9 (6 by maintainers)

Top GitHub Comments

1 reaction
malfet commented, Oct 26, 2021

I’ve fixed the regression in https://github.com/pytorch/vision/pull/4752, added extra testing in https://github.com/pytorch/pytorch.github.io/pull/872, and pushed new cu113 wheel binaries to download.pytorch.org. @epwalsh, can you please try again on your end and let me know if the problem is resolved?
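One way to retest after reinstalling the wheels is to attempt the same dlopen call that appears in the traceback, where torch.ops.load_library calls ctypes.CDLL(path), and capture the loader's message instead of crashing. This is only a sketch: the function name try_dlopen is hypothetical, and importing torch first, as the ENTRYPOINT does, is assumed to be needed so that torch's own libraries are already loaded.

```python
import ctypes
from typing import Optional


def try_dlopen(path: str) -> Optional[str]:
    """Try to dlopen `path`.

    Returns None on success, or the dynamic loader's error text on failure.
    """
    try:
        ctypes.CDLL(path)
    except OSError as exc:
        return str(exc)
    return None


# Example use inside the container (import torch first, as in the ENTRYPOINT):
#   import torch
#   error = try_dlopen(".venv/lib/python3.8/site-packages/torchvision/image.so")
#   print("loaded OK" if error is None else f"still failing: {error}")
```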

1 reaction
epwalsh commented, Oct 26, 2021

Sure, I just tried and still got the same error. Here is the full Dockerfile:

FROM nvidia/cuda:11.1-base-ubuntu20.04

ARG DEBIAN_FRONTEND="noninteractive"
ENV TZ="America/Los_Angeles"

RUN rm -rf /etc/apt/sources.list.d/* \
    && apt-get update \
    && apt-get install -y \
        build-essential \
        curl \
        wget \
        git \
        zip \
        libssl-dev \
        software-properties-common \
        libffi-dev \
        python3-dev \
        python3-pip \
        python3-setuptools \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /work/

RUN pip3 install --no-cache-dir virtualenv && \
    virtualenv -p $(which python3) --copies --reset-app-data .venv && \
    .venv/bin/pip install --no-cache-dir torch==1.10.0+cu113 torchvision==0.11.1+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html

ENTRYPOINT [".venv/bin/python", "-c", "import torch; torch.ops.load_library('.venv/lib/python3.8/site-packages/torchvision/image.so')"]

