gc: fails when attempting to remove cache shared by multiple projects
Bug Report
Issue name
gc: fails when attempting to remove cache shared by multiple projects
Description
When attempting to garbage collect a cache shared by multiple projects, dvc throws an error saying it attempted to write a readonly database.
Reproduce
I don’t have multiple dvc repos to reproduce on
Expected
dvc performs gc as normal
Environment information
Output of dvc doctor:
$ dvc doctor
DVC version: 2.9.5 (pip)
---------------------------------
Platform: Python 3.8.1 on Linux-5.17.5-76051705-generic-x86_64-with-glibc2.10
Supports:
azure (adlfs = 2022.2.0, knack = 0.9.0, azure-identity = 1.8.0),
webhdfs (fsspec = 2022.2.0),
http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
s3 (s3fs = 2022.2.0, boto3 = 1.20.24)
Cache types: reflink, hardlink, symlink
Cache directory: xfs on /dev/mapper/fastdata-fastlv
Caches: local
Remotes: local, s3, local
Workspace directory: xfs on /dev/mapper/fastdata-fastlv
Repo: dvc, git
Additional Information (if any):
$ dvc gc -v -w -p . ../../dcdanko/bdx2 ../../papciak/Biotia-DX/ ../../tpaisie/bdx/ ../../ahmadazim/Biotia-DX/ ../../hwells/Biotia-DX/
2022-08-03 11:31:02,132 WARNING: This will remove all cache except items used in the workspace of the current and the following repos:
- /mnt/fast/dev/dcdanko/bdx1
- /mnt/fast/dev/dcdanko/bdx2
- /mnt/fast/dev/papciak/Biotia-DX
- /mnt/fast/dev/tpaisie/bdx
- /mnt/fast/dev/ahmadazim/Biotia-DX
- /mnt/fast/dev/hwells/Biotia-DX
Are you sure you want to proceed? [y/n]: y
2022-08-03 11:31:04,244 ERROR: unexpected error - attempt to write a readonly database
------------------------------------------------------------
Traceback (most recent call last):
File "/home/dcdanko/miniconda/envs/bdx1/lib/python3.8/site-packages/dvc/cli/__init__.py", line 78, in main
ret = cmd.do_run()
File "/home/dcdanko/miniconda/envs/bdx1/lib/python3.8/site-packages/dvc/cli/command.py", line 22, in do_run
return self.run()
File "/home/dcdanko/miniconda/envs/bdx1/lib/python3.8/site-packages/dvc/commands/gc.py", line 51, in run
self.repo.gc(
File "/home/dcdanko/miniconda/envs/bdx1/lib/python3.8/site-packages/dvc/repo/__init__.py", line 48, in wrapper
return f(repo, *args, **kwargs)
File "/home/dcdanko/miniconda/envs/bdx1/lib/python3.8/site-packages/dvc/repo/gc.py", line 53, in gc
all_repos = [Repo(path) for path in repos]
File "/home/dcdanko/miniconda/envs/bdx1/lib/python3.8/site-packages/dvc/repo/gc.py", line 53, in <listcomp>
all_repos = [Repo(path) for path in repos]
File "/home/dcdanko/miniconda/envs/bdx1/lib/python3.8/site-packages/dvc/repo/__init__.py", line 202, in __init__
self.state = State(self.root_dir, state_db_dir, self.dvcignore)
File "/home/dcdanko/miniconda/envs/bdx1/lib/python3.8/site-packages/dvc/state.py", line 65, in __init__
self.links = Cache(directory=os.path.join(tmp_dir, "links"), **config)
File "/home/dcdanko/miniconda/envs/bdx1/lib/python3.8/site-packages/diskcache/core.py", line 478, in __init__
self.reset(key, value, update=False)
File "/home/dcdanko/miniconda/envs/bdx1/lib/python3.8/site-packages/diskcache/core.py", line 2433, in reset
((old_value,),) = sql(
sqlite3.OperationalError: attempt to write a readonly database
------------------------------------------------------------
2022-08-03 11:31:05,655 DEBUG: Removing '/mnt/fast/dev/dcdanko/.RKoWhSFAMKbZQvEyT5Twwi.tmp'
2022-08-03 11:31:05,656 DEBUG: Removing '/mnt/fast/dev/dcdanko/.RKoWhSFAMKbZQvEyT5Twwi.tmp'
2022-08-03 11:31:05,656 DEBUG: Removing '/mnt/fast/dev/dcdanko/.RKoWhSFAMKbZQvEyT5Twwi.tmp'
2022-08-03 11:31:05,656 DEBUG: Removing '/fast/bdx/.shared_dvc_cache/.6xypQvximg96enbwqfa4tN.tmp'
2022-08-03 11:31:05,674 DEBUG: Version info for developers:
DVC version: 2.9.5 (pip)
---------------------------------
Platform: Python 3.8.1 on Linux-5.17.5-76051705-generic-x86_64-with-glibc2.10
Supports:
azure (adlfs = 2022.2.0, knack = 0.9.0, azure-identity = 1.8.0),
webhdfs (fsspec = 2022.2.0),
http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
s3 (s3fs = 2022.2.0, boto3 = 1.20.24)
Cache types: reflink, hardlink, symlink
Cache directory: xfs on /dev/mapper/fastdata-fastlv
Caches: local
Remotes: local, s3, local
Workspace directory: xfs on /dev/mapper/fastdata-fastlv
Repo: dvc, git
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2022-08-03 11:31:05,676 DEBUG: Analytics is disabled.
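A minimal sketch for narrowing this down, assuming the failure comes from a state database in one of the other users' repos that the current user cannot write to (the traceback shows diskcache opening a "links" cache under a temp directory while Repo(path) is being constructed). The .dvc/tmp location and the helper names find_unwritable and report are assumptions for illustration, not part of DVC; pass a different subdirectory if the state directory lives elsewhere in your setup.

```python
import os
import sys
from pathlib import Path


def find_unwritable(root):
    """Return paths under root (root included) the current user cannot write to."""
    root = Path(root)
    if not root.exists():
        return []
    candidates = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            candidates.append(Path(dirpath) / name)
    return [p for p in candidates if not os.access(p, os.W_OK)]


def report(repo_paths, state_subdir=os.path.join(".dvc", "tmp")):
    """Map each repo path to the unwritable entries under its assumed state dir."""
    # state_subdir is an assumption; adjust it to wherever the state DB is kept.
    return {
        str(repo): find_unwritable(Path(repo) / state_subdir)
        for repo in repo_paths
    }


if __name__ == "__main__":
    # Usage: python check_state_dirs.py . ../../dcdanko/bdx2 ...
    for repo, blocked in report(sys.argv[1:]).items():
        status = "ok" if not blocked else f"{len(blocked)} unwritable"
        print(f"{repo}: {status}")
        for path in blocked:
            print(f"  {path}")
```

Running it with the same repo paths passed to dvc gc -p would list, per repo, any entries the invoking user lacks write access to. It only reads permissions and changes nothing.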
Issue Analytics
- Created a year ago
- Comments: 10 (3 by maintainers)
Top GitHub Comments
@dcdanko thank you! I’ve followed the steps from the doc you shared and have set up a separate caching directory.
On top of that, I had to adjust permissions for GID inheritance (chmod u=rwx,g=rwx,o=,g+s ~/dvc-cache/) and use dvc config cache.type copy so that the files are editable within my setup. My issue is resolved now.
UPD: I am sorry, meant to tag @daavoo
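For reference, the symbolic mode in the chmod command quoted above (u=rwx,g=rwx,o=,g+s) corresponds to octal 2770: full access for owner and group, none for others, plus the setgid bit so that new entries inherit the directory's group. A small sketch of the same change from Python follows; the helper name make_group_shared is hypothetical, and it only touches the top-level directory, as the quoted command does.

```python
import os
import stat

# u=rwx,g=rwx,o=,g+s expressed as mode bits (octal 2770).
SHARED_DIR_MODE = stat.S_ISGID | stat.S_IRWXU | stat.S_IRWXG


def make_group_shared(path):
    """Apply the group-shared mode to one directory (hypothetical helper)."""
    os.chmod(path, SHARED_DIR_MODE)
    # Return the permission bits actually in effect after the call.
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    cache_dir = os.path.expanduser("~/dvc-cache")  # path taken from the comment above
    print(oct(make_group_shared(cache_dir)))
```

Some systems drop the setgid bit if the caller is not a member of the directory's group, so it is worth checking the returned mode rather than assuming it took effect.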
Thanks, I just tried with dvc 2.18.1 and the error persists.