`dvc exp run --run-all`: only runs one experiment and then hangs
Bug Report
Description
When running `dvc exp run --run-all`, only one experiment runs to completion, and the next experiment never starts.
Reproduce
- Add multiple experiments to the queue with `dvc exp run --queue`
- Run `dvc exp run --run-all`
Expected
All experiments should run sequentially.
Environment information
$ dvc doctor
DVC version: 2.18.1 (pip)
---------------------------------
Platform: Python 3.10.4 on Linux-5.4.0-80-generic-x86_64-with-glibc2.31
Supports:
http (aiohttp = 3.8.1, aiohttp-retry = 2.5.5),
https (aiohttp = 3.8.1, aiohttp-retry = 2.5.5),
webhdfs (fsspec = 2022.7.1)
Cache types: hardlink, symlink
Cache directory: nfs4 on <REDACTED>
Caches: local
Remotes: None
Workspace directory: nfs4 on <REDACTED>
Repo: dvc, git
Additional information
There seem to be no active workers, even though there should be. This is the output after starting the queue following the failed `dvc exp run --run-all` mentioned above:
$ dvc queue start
Started '1' new experiments task queue worker.
$ dvc queue status
Task     Name       Created       Status
c79d124             Aug 22, 2022  Running
eb3633a             Aug 22, 2022  Queued
3dcf13c             Aug 22, 2022  Queued
11a6ea4             Aug 22, 2022  Queued
a2f2386             Aug 22, 2022  Queued
a49a3bf             Aug 22, 2022  Queued
5344ab1             Aug 22, 2022  Queued
4ca8ed7             Aug 22, 2022  Queued
8b0ba0b             Aug 22, 2022  Queued
cb7d0d0  exp-03827  Aug 22, 2022  Success
Worker status: 0 active, 0 idle
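For anyone who wants to detect this state automatically, here is a minimal sketch that parses the text printed by `dvc queue status` and flags a queue that has pending tasks but no workers. It assumes the output format matches what is shown in this report (it may differ in other DVC versions); the function names `summarize` and `looks_stalled` are hypothetical helpers, not part of DVC, and the sample text is an abridged copy of the output above.

```python
import re
from collections import Counter

# Abridged copy of the `dvc queue status` output from this report.
STATUS_OUTPUT = """\
Task     Name       Created       Status
c79d124             Aug 22, 2022  Running
eb3633a             Aug 22, 2022  Queued
3dcf13c             Aug 22, 2022  Queued
cb7d0d0  exp-03827  Aug 22, 2022  Success
Worker status: 0 active, 0 idle
"""


def summarize(text):
    """Count tasks per status and read the worker counts (hypothetical helper)."""
    statuses = Counter()
    workers = {"active": 0, "idle": 0}
    for line in text.splitlines():
        match = re.match(r"Worker status: (\d+) active, (\d+) idle", line)
        if match:
            workers = {"active": int(match.group(1)), "idle": int(match.group(2))}
            continue
        fields = line.split()
        # Assumption: task rows start with a 7-character hex hash,
        # and the status is always the last column.
        if fields and re.fullmatch(r"[0-9a-f]{7}", fields[0]):
            statuses[fields[-1]] += 1
    return statuses, workers


def looks_stalled(text):
    """True when tasks are queued or running but no worker is alive."""
    statuses, workers = summarize(text)
    pending = statuses["Queued"] + statuses["Running"]
    return pending > 0 and workers["active"] + workers["idle"] == 0


if __name__ == "__main__":
    print(looks_stalled(STATUS_OUTPUT))
```

In practice the text would come from running `dvc queue status` and capturing its standard output rather than from a hard-coded string.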
Top GitHub Comments
@karajan1001 I updated dvc and now everything is running like a charm! Thanks
Hey @karajan1001 thanks for the quick response.
I get `0.1.0`. I will try to update and see if this resolves the issue. Thanks!