Unable to run the voltaml/volta_diffusion:v0.1 docker image

-> % sudo docker run -it --gpus all voltaml/volta_diffusion:v0.1 bash
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: mount error: file creation failed: /var/lib/docker/overlay2/e049fdb3bc56fecdeefb3b950034cbc757eeb166b152330d00ef6e8a2972af06/merged/usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: file exists: unknown.
ERRO[0000] error waiting for container: context canceled

This is probably because, when --gpus=all is specified, Docker (through the nvidia-container-cli hook shown in the error) tries to mount the host's NVIDIA and CUDA libraries into the container. Some of those paths already exist in the image (e.g. /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1), apparently as links rather than regular files, so the mount fails with "file exists".

Could you please open-source the Dockerfile as well?
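One way to check this guess is to list the suspect paths inside the image without requesting GPUs, so the NVIDIA hook has no reason to run. The following is only a sketch: the helper name list_image_nvidia_libs and the DOCKER variable are made up for illustration, and it assumes the NVIDIA runtime is not configured as Docker's default runtime and that the image ships ls.

```shell
# Hypothetical helper, not part of the original issue.
# DOCKER can be set to a different docker binary path if needed.
DOCKER="${DOCKER:-docker}"

# List the driver libraries that already exist inside an image.
# No --gpus flag is passed, so the container should start without the
# NVIDIA mount step that fails in the error above.
list_image_nvidia_libs() {
  image="$1"
  "$DOCKER" run --rm --entrypoint ls "$image" -l \
    /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 \
    /usr/lib/x86_64-linux-gnu/libcuda.so.1
}
```

Calling list_image_nvidia_libs voltaml/volta_diffusion:v0.1 should show whether each path is present in the image and whether it is a symlink (an "l" in the first column of the ls -l output).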

Issue Analytics

Created 10 months ago
Reactions: 1
Comments: 5 (2 by maintainers)

Top GitHub Comments

3 reactions

Pop115 commented, Nov 25, 2022

Same issue here. I found a related issue on the nvidia-docker repo: https://github.com/NVIDIA/nvidia-docker/issues/1551

I made a Dockerfile containing this:

RUN rm -rf /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 /usr/lib/x86_64-linux-gnu/libcuda.so.1 /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.1 /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.1 /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1

and built it with docker build -t voltaml/volta_diffusion -f Dockerfile .

It seems to work.
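The comment quotes only the RUN instruction, so a complete Dockerfile needs a base image as well. The sketch below writes one out; the FROM line is an assumption (the published image that fails to start), and the helper name write_workaround_dockerfile is hypothetical.

```shell
# Sketch only: the FROM line is an assumption, since the comment above
# quotes just the RUN instruction of the Dockerfile.
write_workaround_dockerfile() {
  target_dir="${1:-.}"
  cat > "$target_dir/Dockerfile" <<'EOF'
# Assumed base image: the published image that fails to start with --gpus.
FROM voltaml/volta_diffusion:v0.1

# Remove the driver libraries baked into the image so the NVIDIA runtime
# can mount the host's copies when the container starts.
RUN rm -rf /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 \
    /usr/lib/x86_64-linux-gnu/libcuda.so.1 \
    /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.1 \
    /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.1 \
    /usr/lib/x86_64-linux-gnu/libnvcuvid.so.1
EOF
}
```

After running write_workaround_dockerfile . in an empty directory, the docker build command from the comment can be used as written.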

1 reaction

JackCloudman commented, Nov 25, 2022

Download this file: https://gist.github.com/JackCloudman/7143c7aeaafa54ed35b3f6cfe8a30c57 and save it as Dockerfile. Then build and run the image:

docker build -t voltaml/volta_diffusion:v0.1 -f Dockerfile .
docker run -it --gpus=all -p "8888:8888" voltaml/volta_diffusion:v0.1 jupyter lab --port=8888 --no-browser --ip 0.0.0.0 --allow-root
