exp: Queuing experiment causes "Unable to acquire lock" error although no lock file exists at first

Bug Report

Description

I am trying to queue an experiment. Before running dvc exp run --queue, the .dvc/tmp directory contains no .dvc/tmp/lock file, no .dvc/tmp/rwlock.lock file, and no other lock file. While the command runs, several lock files are created in .dvc/tmp/ (lock, exp_scm_lock, and multiple <sha>.lock files). Eventually, the command fails with the error message:

ERROR: Unable to acquire lock. Most likely another DVC process is running or was terminated abruptly. Check the page <https://dvc.org/doc/user-guide/troubleshooting#lock-issue> for other possible reasons and to learn how to resolve this.
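
To confirm which lock files exist before and after the command, a small helper along these lines could be run from the project root. This is only a sketch: the function name find_lock_files is hypothetical, and matching on names that end in "lock" is an assumption based on the file names mentioned in this report, not on DVC's documented behaviour.

```python
from pathlib import Path


def find_lock_files(repo_root="."):
    """Return the names of lock-like files in <repo_root>/.dvc/tmp, sorted."""
    tmp_dir = Path(repo_root) / ".dvc" / "tmp"
    if not tmp_dir.is_dir():
        return []
    names = []
    for entry in tmp_dir.iterdir():
        # Assumption: "lock", "exp_scm_lock" and "<sha>.lock" (the names seen
        # in this report) all end in "lock", so a suffix match covers them.
        if entry.is_file() and entry.name.endswith("lock"):
            names.append(entry.name)
    return sorted(names)


if __name__ == "__main__":
    for name in find_lock_files():
        print(name)
```

Running it once before dvc exp run --queue and once after the failure shows which files the command left behind.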

Note that I am working in a larger mono-repo with a local .dvc project folder. Git operations generally take a bit longer there, and I wonder whether that is related to the observed error.

Also, I previously had a queue running and stopped it with dvc queue stop. That was a while ago (~1h). Since then I have uninstalled DVC version 2.31.0 and installed a newer version (I tried 2.33.2 and 2.34.0, both installed on CentOS 7 using yum).

Downgrading to DVC version 2.31.0 resolved the issue.

I had previously set core.hardlink_lock = true in .dvc/config, following the guidance in the error message. This setting made no difference to the observed issue.

Reproduce

Expected

Environment information

Output of dvc doctor:

$ dvc doctor

Additional Information (if any):

Issue Analytics

  • State: open
  • Created: 10 months ago
  • Comments: 12 (1 by maintainers)

Top GitHub Comments

1 reaction
aschuh-hf commented, Nov 24, 2022

Might this issue be caused by long-running commands issued by the VS Code extension? I am observing it again after updating DVC to the latest version, 2.35.1.

0 reactions
aschuh-hf commented, Dec 20, 2022

> In https://github.com/iterative/dvc/pull/8623, we introduce a double-check to check if the process in .dvc/tmp/rwlock still exists, it can solve this issue.

I’ve seen this PR, but nonetheless I experienced this issue only with DVC versions which include this PR. Could it be that the feature is not working as intended?
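
For readers who want to check by hand whether the processes recorded in .dvc/tmp/rwlock are still alive, the idea can be sketched as below. This is not the code from the PR. It assumes the rwlock file is JSON with a "write" mapping of path to {"pid", "cmd"} and a "read" mapping of path to a list of such entries. That layout is an assumption and may differ between DVC versions. The names pid_alive and stale_entries are hypothetical, and the liveness probe is POSIX-only.

```python
import json
import os
from pathlib import Path


def pid_alive(pid):
    """POSIX-only probe: signal 0 checks that a process exists without signalling it."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stale_entries(rwlock, is_alive=pid_alive):
    """Return (mode, path, pid) tuples whose recorded process no longer exists."""
    stale = []
    # Assumed layout: {"write": {path: {"pid": ...}}, "read": {path: [{"pid": ...}]}}
    for path, info in rwlock.get("write", {}).items():
        if not is_alive(info["pid"]):
            stale.append(("write", path, info["pid"]))
    for path, infos in rwlock.get("read", {}).items():
        for info in infos:
            if not is_alive(info["pid"]):
                stale.append(("read", path, info["pid"]))
    return stale


if __name__ == "__main__":
    rwlock_path = Path(".dvc") / "tmp" / "rwlock"
    if rwlock_path.is_file():
        for entry in stale_entries(json.loads(rwlock_path.read_text())):
            print(entry)
```

Any entries it prints point at processes that have exited but are still recorded as lock holders.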

> Could you please elaborate on what you mean by "Downgrading to DVC version 2.31.0 resolved the issue"? Does it mean that when the error occurs and you switch back to version 2.31.0 the command succeeds, or that in 2.31.0 the problem is not triggered?

By downgrading I mean installing an earlier version after I encountered the issue. I think I still had to delete the lock files manually, but with the older DVC version I no longer encountered this issue afterwards.
