
Docker Runtime Error: Not Compiled with GPU support

See original GitHub issue

❓ Questions and Help

Hello,

I have a strange problem with the Docker image. When I build the image following the instructions in INSTALL.md and then try training on the COCO 2014 dataset with the command below, I get RuntimeError: Not compiled with GPU support (nms at ./maskrcnn_benchmark/csrc/nms.h:22).

nvidia-docker run --shm-size=8gb -v /home/archdyn/Datasets/coco:/maskrcnn-benchmark/datasets/coco maskrcnn-benchmark python /maskrcnn-benchmark/tools/train_net.py --config-file "/maskrcnn-benchmark/configs/e2e_mask_rcnn_R_50_FPN_1x.yaml" SOLVER.IMS_PER_BATCH 1 SOLVER.BASE_LR 0.0025 SOLVER.MAX_ITER 720000 SOLVER.STEPS "(480000, 640000)" TEST.IMS_PER_BATCH 1

But when I change the Dockerfile to comment out the line python setup.py build develop before WORKDIR /maskrcnn-benchmark, and then run python setup.py build develop inside the built Docker container, I can train without problems.

My environment when running the Docker container:

2018-11-17 20:03:13,889 maskrcnn_benchmark INFO: Collecting env info (might take some time)
2018-11-17 20:03:15,634 maskrcnn_benchmark INFO: 
PyTorch version: 1.0.0.dev20181116
Is debug build: No
CUDA used to build PyTorch: 9.0.176

OS: Ubuntu 16.04.5 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.10) 5.4.0 20160609
CMake version: version 3.5.1

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: 9.0.176
GPU models and configuration: GPU 0: GeForce GTX 850M
Nvidia driver version: 410.73
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.4.1
/usr/lib/x86_64-linux-gnu/libcudnn_static_v7.a

Versions of relevant libraries:
[pip] Could not collect
[conda] pytorch-nightly           1.0.0.dev20181116 py3.6_cuda9.0.176_cudnn7.1.2_0    pytorch
        Pillow (5.3.0)

Does somebody know why this problem happens?
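
One way to narrow this down is to check, inside the running container, whether PyTorch sees the GPU and whether the compiled extension actually imports. Below is a minimal diagnostic sketch, assuming the compiled module is exposed as maskrcnn_benchmark._C (the extension name used in the repository's setup.py):

# Quick diagnostic; run it inside the container, e.g. via
# nvidia-docker run --rm -it maskrcnn-benchmark python
import torch
from torch.utils.cpp_extension import CUDA_HOME

print("torch version:", torch.__version__)
print("torch.cuda.is_available():", torch.cuda.is_available())
print("CUDA_HOME:", CUDA_HOME)

try:
    # setup.py builds the repo's C++/CUDA ops into this module.
    from maskrcnn_benchmark import _C
    print("compiled extension loaded from:", _C.__file__)
except ImportError as e:
    print("compiled extension not importable:", e)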

Issue Analytics

  • State: open
  • Created: Nov 2018
  • Reactions: 5
  • Comments: 57 (41 by maintainers)

Top GitHub Comments

archdyn commented on Nov 21, 2018 (13 reactions)

You could try what worked for me: take the following Dockerfile and build it with nvidia-docker. It is the repository's Dockerfile with just two lines removed under "install PyTorch Detection".

ARG CUDA="9.0"
ARG CUDNN="7"

FROM nvidia/cuda:${CUDA}-cudnn${CUDNN}-devel-ubuntu16.04

RUN echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections

# install basics
RUN apt-get update -y \
 && apt-get install -y apt-utils git curl ca-certificates bzip2 cmake tree htop bmon iotop g++

# Install Miniconda
RUN curl -so /miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh \
 && chmod +x /miniconda.sh \
 && /miniconda.sh -b -p /miniconda \
 && rm /miniconda.sh

ENV PATH=/miniconda/bin:$PATH

# Create a Python 3.6 environment
RUN /miniconda/bin/conda install -y conda-build \
 && /miniconda/bin/conda create -y --name py36 python=3.6.7 \
 && /miniconda/bin/conda clean -ya

ENV CONDA_DEFAULT_ENV=py36
ENV CONDA_PREFIX=/miniconda/envs/$CONDA_DEFAULT_ENV
ENV PATH=$CONDA_PREFIX/bin:$PATH
ENV CONDA_AUTO_UPDATE_CONDA=false

RUN conda install -y ipython
RUN pip install ninja yacs cython matplotlib

# Install PyTorch 1.0 Nightly
RUN conda install -y pytorch-nightly -c pytorch && conda clean -ya

# Install TorchVision master
RUN git clone https://github.com/pytorch/vision.git \
 && cd vision \
 && python setup.py install

# install pycocotools
RUN git clone https://github.com/cocodataset/cocoapi.git \
 && cd cocoapi/PythonAPI \
 && python setup.py build_ext install

# install PyTorch Detection
RUN git clone https://github.com/facebookresearch/maskrcnn-benchmark.git

WORKDIR /maskrcnn-benchmark

After that, go into the Docker container with the following command:

nvidia-docker run --rm -it maskrcnn-benchmark bash

Inside the Docker container, execute:

python setup.py build develop

Then exit the Docker container with CTRL + p followed by CTRL + q; this gets you out of the container without stopping it. Alternatively, you can just open a new console. From that console, run:

docker commit <CONTAINER ID> maskrcnn-benchmark

After all this you should have a working Docker image without this runtime error. At least for me this worked.
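
To confirm the committed image really has the CUDA ops compiled in, a quick check is to call the custom nms op on GPU tensors, since that is exactly the call that raises "Not compiled with GPU support". A minimal sketch, assuming the nms wrapper is importable from maskrcnn_benchmark.layers as in the repository:

# Paste into a Python shell started inside the container
# (nvidia-docker run --rm -it maskrcnn-benchmark python)
import torch
from maskrcnn_benchmark.layers import nms  # thin wrapper around the compiled _C.nms op

# Two overlapping boxes in (x1, y1, x2, y2) format, placed on the GPU on purpose:
# the GPU code path is the one that fails when the extension was built CPU-only.
boxes = torch.tensor([[0., 0., 10., 10.],
                      [1., 1., 11., 11.]], device="cuda")
scores = torch.tensor([0.9, 0.8], device="cuda")

keep = nms(boxes, scores, 0.5)
print("GPU nms OK, kept indices:", keep)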

zimenglan-sysu-512 commented on Nov 21, 2018 (6 reactions)

Hi @archdyn @fmassa @miguelvr, when I use nvidia-docker to build the image, torch.cuda.is_available() returns False, and then, after building, when I use nvidia-docker to run the image, torch.cuda.is_available() returns True. It is so weird.

With the help of a friend debugging it, we finally found that changing the line in setup.py as below:

# if torch.cuda.is_available() and CUDA_HOME is not None:
if CUDA_HOME is not None:

solves the problem "RuntimeError: Not compiled with GPU support (nms at /algo_code/maskrcnn_benchmark/csrc/nms.h:22)" when running the image.

Although I have no idea why torch.cuda.is_available() returns False at build time.

Thanks.
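
The likely reason is that docker build does not run under the NVIDIA runtime, so no GPU device is visible during the build even though the CUDA toolkit is installed in the devel image; that is why keying the build on CUDA_HOME alone works. Below is a paraphrased sketch of what that switch in setup.py looks like (the exact upstream code differs in details such as include paths and compiler flags):

import glob
import os

import torch
from torch.utils.cpp_extension import CUDA_HOME, CppExtension, CUDAExtension

def get_extension(extensions_dir):
    # C++ sources are always built; CUDA sources only when a CUDA toolchain is present.
    sources_cpu = glob.glob(os.path.join(extensions_dir, "*.cpp")) \
        + glob.glob(os.path.join(extensions_dir, "cpu", "*.cpp"))
    sources_cuda = glob.glob(os.path.join(extensions_dir, "cuda", "*.cu"))

    extension = CppExtension
    sources = sources_cpu
    define_macros = []

    # Inside `docker build` there is no GPU attached, so torch.cuda.is_available()
    # is False even though nvcc is installed. Checking CUDA_HOME alone lets the
    # CUDA kernels compile at image build time.
    if CUDA_HOME is not None:  # instead of: torch.cuda.is_available() and CUDA_HOME is not None
        extension = CUDAExtension
        sources = sources_cpu + sources_cuda
        define_macros += [("WITH_CUDA", None)]

    return extension(
        "maskrcnn_benchmark._C",
        sources,
        define_macros=define_macros,
    )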

Read more comments on GitHub >
