Stopping a run from the Wandb UI doesn't work in Docker
- Weights and Biases version: 0.8.13
- Python version: 3.6.8
- Operating System: Ubuntu 16.04
Description
I would like to stop a run from the Wandb UI, using the “stop run” option on the run overview page. This works for experiments started from the command line, but it does not work when the experiment is running in a Docker container.
I have started an experiment in the container with something like (kubernetes-config-style):
command: [ "/bin/bash", "-c", "--" ]
args:
- "cd /my_code && python3 -m my_experiment_module"
When I hit the “stop run” button, I don’t see any output in the container logs - it just keeps running like nothing happened.
I know a common issue with docker containers is that you’ll send a signal to the container, but it’ll get eaten by bash and not passed on to your program. But here it doesn’t seem like that would be the issue.
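One way to check whether any signal reaches the Python process at all is to log signals as they arrive. The following is only a diagnostic sketch: the helper names (`log_signal`, `install_signal_logging`) are made up for illustration, and the choice of SIGINT and SIGTERM is an assumption, since the issue does not say which signal W&B uses to stop a run.

```python
import os
import signal

received = []  # signal numbers seen so far, in arrival order


def log_signal(signum, frame):
    """Record the signal and print it with this process's PID."""
    received.append(signum)
    print(f"pid={os.getpid()} got {signal.Signals(signum).name}", flush=True)


def install_signal_logging(signals=(signal.SIGINT, signal.SIGTERM)):
    """Install log_signal for each signal and return the previous handlers.

    Note: this replaces Python's default SIGINT handler, so while it is
    installed the process logs the signal instead of raising
    KeyboardInterrupt. Use it for diagnosis only. Must be called from the
    main thread.
    """
    return {sig: signal.signal(sig, log_signal) for sig in signals}
```

Calling `install_signal_logging()` at the top of the experiment module and then pressing “stop run” would show in the container logs whether a signal was delivered. If nothing is printed, the signal never reached the Python process.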
Issue Analytics
- State:
- Created 4 years ago
- Comments: 6 (3 by maintainers)
Thanks for the report. We will try to reproduce this using your instructions. We should be sending a signal from a spawned process back to the parent process (your experiment module), but perhaps we will need to do something slightly different in the environment you created.
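The mechanism described here, a spawned process signalling its parent, can be sketched roughly as follows. This is not W&B's actual implementation: the function name `stopped_by_helper` is hypothetical, and using SIGINT is an assumption, since the comment does not name the signal. The sketch may still be useful for testing whether parent-directed signals behave the same way inside a given container.

```python
import os
import signal
import subprocess
import sys
import time


def stopped_by_helper(timeout=5.0):
    """Spawn a child that sends SIGINT to this process.

    Returns True if the signal arrived (as KeyboardInterrupt) before
    `timeout` seconds elapsed, False otherwise. Must run in the main thread.
    """
    # The child looks up its parent PID and signals it, then exits.
    helper_code = "import os, signal; os.kill(os.getppid(), signal.SIGINT)"
    # Python skips its default SIGINT handler if SIGINT was ignored at
    # startup, so install it explicitly for the duration of the check.
    previous = signal.signal(signal.SIGINT, signal.default_int_handler)
    helper = None
    try:
        helper = subprocess.Popen([sys.executable, "-c", helper_code])
        time.sleep(timeout)  # stand-in for the training loop
        return False
    except KeyboardInterrupt:
        return True
    finally:
        if helper is not None:
            helper.wait()  # reap the child
        signal.signal(signal.SIGINT, previous)
```

Running `stopped_by_helper()` both on the host and inside the container would show whether the two environments differ in how such a signal is delivered.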
Basically I’ve just been using
wandb-docker-run -d myImage
to run a few runs with wandb. But when trying to stop one of them through the web interface, the container never seemed to receive this stop signal and simply continued with the run without any indication in the logs.