`dvc exp run --run-all`: only runs one experiment and then hangs

Bug Report

Description

When running `dvc exp run --run-all`, only the first experiment runs to completion; the next experiment never starts.

Reproduce

  1. Add multiple experiments to the queue with `dvc exp run --queue`
  2. Run `dvc exp run --run-all`
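
The two steps above can be sketched as a small shell function. This is only an illustration, not the reporter's actual script: the function name `reproduce_hang`, the `DVC_BIN` override, and the parameter name `train.lr` in the usage note are hypothetical, and the sketch assumes a project whose `params.yaml` defines the parameter being varied.

```shell
#!/bin/sh
# DVC_BIN (hypothetical) lets the dvc executable be overridden; defaults to `dvc`.
DVC_BIN="${DVC_BIN:-dvc}"

# Queue one experiment per parameter value, then run the whole queue.
# Usage: reproduce_hang PARAM VALUE [VALUE ...]
reproduce_hang() {
    param="$1"
    shift
    for value in "$@"; do
        # --queue stores the experiment without running it;
        # -S (--set-param) overrides a value from the params file.
        "$DVC_BIN" exp run --queue -S "${param}=${value}" || return 1
    done
    # On the affected setup, this ran one experiment and then stalled.
    "$DVC_BIN" exp run --run-all
}
```

For example, `reproduce_hang train.lr 0.1 0.01 0.001` would queue three experiments and then try to run them all.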

Expected

All experiments should run sequentially.

Environment information

$ dvc doctor
DVC version: 2.18.1 (pip)
---------------------------------
Platform: Python 3.10.4 on Linux-5.4.0-80-generic-x86_64-with-glibc2.31
Supports:
        http (aiohttp = 3.8.1, aiohttp-retry = 2.5.5),
        https (aiohttp = 3.8.1, aiohttp-retry = 2.5.5),
        webhdfs (fsspec = 2022.7.1)
Cache types: hardlink, symlink
Cache directory: nfs4 on <REDACTED>
Caches: local
Remotes: None
Workspace directory: nfs4 on <REDACTED>
Repo: dvc, git

Additional information

No workers seem to be active, even though at least one should be. This is the output after starting the queue following the failed `dvc exp run --run-all` mentioned above:

$ dvc queue start
Started '1' new experiments task queue worker.
$ dvc queue status
Task     Name       Created       Status
c79d124             Aug 22, 2022  Running
eb3633a             Aug 22, 2022  Queued
3dcf13c             Aug 22, 2022  Queued
11a6ea4             Aug 22, 2022  Queued
a2f2386             Aug 22, 2022  Queued
a49a3bf             Aug 22, 2022  Queued
5344ab1             Aug 22, 2022  Queued
4ca8ed7             Aug 22, 2022  Queued
8b0ba0b             Aug 22, 2022  Queued
cb7d0d0  exp-03827  Aug 22, 2022  Success

Worker status: 0 active, 0 idle
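
The symptom in this output (a task marked Running while no worker is active) can be detected mechanically. The following is a rough sketch that assumes the `dvc queue status` layout shown above, which may differ between DVC versions; the function name `queue_looks_stalled` is hypothetical.

```shell
#!/bin/sh
# Reads `dvc queue status` output on stdin and succeeds (exit 0) when at
# least one task is listed as Running while the worker summary reports
# zero active workers.
queue_looks_stalled() {
    awk '
        BEGIN { active = -1 }                   # -1: no worker summary seen
        $NF == "Running" { running++ }          # task rows end with the status
        /^Worker status:/ { active = $3 + 0 }   # "Worker status: N active, ..."
        END { exit !(running > 0 && active == 0) }
    '
}
```

It could be used as `dvc queue status | queue_looks_stalled && echo "queue looks stalled"`.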

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 7 (1 by maintainers)

Top GitHub Comments

1 reaction
JohnTheDeere commented, Aug 29, 2022

@karajan1001 I updated dvc and now everything is running like a charm! Thanks

1 reaction
JohnTheDeere commented, Aug 28, 2022

Hey @karajan1001 thanks for the quick response.

I get 0.1.0. I will try to update and see if this resolves the issue - thanks!
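
Since the reporter's DVC was installed with pip (per the `dvc doctor` output) and updating resolved the problem, the fix can be sketched roughly as below. The thread does not say which version contains the fix, so none is pinned here; the function name `upgrade_dvc` and the `PIP_BIN` and `DVC_BIN` overrides are hypothetical.

```shell
#!/bin/sh
# PIP_BIN and DVC_BIN (hypothetical) allow the executables to be overridden.
PIP_BIN="${PIP_BIN:-pip}"
DVC_BIN="${DVC_BIN:-dvc}"

# Upgrade a pip-installed DVC, then print the environment report so the
# new version can be confirmed before rerunning the queue.
upgrade_dvc() {
    "$PIP_BIN" install --upgrade dvc && "$DVC_BIN" doctor
}
```

After `upgrade_dvc` succeeds, rerunning `dvc exp run --run-all` shows whether the queue now drains.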
