--gpus option not working on recently updated docker image
I have the following YAML workflow:
on:
  push:
    branches:
      - GPU-debug
jobs:
  deploy-runner:
    runs-on: [ubuntu-latest]
    steps:
      - uses: iterative/setup-cml@v1
      - uses: actions/checkout@v2
      - name: Deploy runner on EC2
        env:
          PERSONAL_ACCESS_TOKEN: ${{ secrets.REPO_TOKEN }}
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-west-1
        run: |
          cml-runner \
            --repo https://github.com/sergeychuvakin/DVC_CML_sanbox \
            --token=$PERSONAL_ACCESS_TOKEN \
            --cloud aws \
            --cloud-region us-west-1 \
            --cloud-type=g3.4xlarge \
            --labels=cml-runner \
            --idle-timeout 30
  model-training:
    timeout-minutes: 5000
    needs: [deploy-runner]
    runs-on: [self-hosted, cml-runner]
    container:
      image: docker://dvcorg/cml:0-dvc2-base0-gpu
      options: --gpus all
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: '3.7'
      - name: Train model
        env:
          repo_token: ${{ secrets.REPO_TOKEN }}
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
        run: |
          nvidia-smi
        shell: bash
I get the following error while the image is building:
I tried two different images, docker://dvcorg/cml:0-dvc2-base0-gpu and docker://dvcorg/cml:0-dvc2-base1-gpu, and both gave me the same error.
When I removed the option --gpus all, this error went away, but then nvidia-smi was not found.
Thanks in advance!
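One rough way to narrow down a failure like this is to check the runner host itself, outside of any container. The sketch below is not part of the original report, and the helper name check_tool is hypothetical. It only reports whether nvidia-smi and docker are on the PATH of the machine it runs on. It assumes that a missing nvidia-smi on the host points to a missing NVIDIA driver, which --gpus all depends on (together with the NVIDIA Container Toolkit).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original issue): report whether a
# command is available on the PATH of the machine running this script.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

# Run on the self-hosted runner host, not inside the job container.
# Assumption: nvidia-smi is installed together with the NVIDIA driver.
check_tool nvidia-smi
check_tool docker
```

This could be placed in the run: block of a temporary job that uses runs-on: [self-hosted, cml-runner] and has no container: section, so that the commands execute directly on the EC2 instance.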
Issue Analytics
- State:
- Created 2 years ago
- Reactions: 2
- Comments: 5 (3 by maintainers)
Top GitHub Comments
You’re welcome! We won’t ever know what the exact issue was, but at least it’s solved. 🙃
@0x2b3bfa0 Yes, indeed, I cannot reproduce it either. Looks like you’re right: the issue was on the AWS side. Now it works as expected. Thank you!