Datasets cache folder not shared between users
Hello, a discussion on this issue started on Slack.
I have a ClearML server hosted on AWS with web authentication enabled. Each ML person has:
- their own user/pass used to log in to the ClearML server
- their own set of ClearML API credentials
- their own set of AWS credentials
- their own clearml.conf file in their home directory
Every config file points to the same cache folder via:
storage {
    cache {
        # Defaults to system temp folder / cache
        default_base_dir: "/scratch/clearml-cache"
    }
}
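Since every user points at the same default_base_dir, the permissions on that directory matter. Here is a minimal sketch, standard library only, for inspecting them on a POSIX system. The helper name describe_dir is hypothetical and not part of ClearML; the path is the one from the config above.

```python
import os
import stat


def describe_dir(path):
    """Return basic ownership and permission facts about a directory (POSIX)."""
    st = os.stat(path)
    mode = stat.S_IMODE(st.st_mode)
    return {
        "mode": oct(mode),
        "uid": st.st_uid,
        "gid": st.st_gid,
        # Can other members of the owning group create/modify entries here?
        "group_writable": bool(mode & stat.S_IWGRP),
        "world_writable": bool(mode & stat.S_IWOTH),
        # With setgid on a directory, new entries inherit the directory's group.
        "setgid": bool(mode & stat.S_ISGID),
    }


if __name__ == "__main__":
    cache_dir = "/scratch/clearml-cache"  # path taken from the config above
    if os.path.isdir(cache_dir):
        print(describe_dir(cache_dir))
```

If group_writable is false, or the users do not share a group, a second user would probably be unable to create or update files that the first user left there.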
We have some datasets registered on the ClearML server, and the codebase uses get_local_copy() to download the data onto the machine. The problem manifests when two or more people want to access the same dataset (read-only access, i.e. the cache already exists and isn't corrupted).
The execution fails with this error:
Traceback (most recent call last):
  ...
    path = Dataset.get(dataset_name=dataset_name,
  File "/scratch/people/gf/miniconda3/envs/py38-torch/lib/python3.8/site-packages/clearml/datasets/dataset.py", line 567, in get_local_copy
    target_folder = self._merge_datasets(
  File "/scratch/people/gf/miniconda3/envs/py38-torch/lib/python3.8/site-packages/clearml/datasets/dataset.py", line 1387, in _merge_datasets
    target_base_folder = self._create_ds_target_folder(
  File "/scratch/people/gf/miniconda3/envs/py38-torch/lib/python3.8/site-packages/clearml/datasets/dataset.py", line 1336, in _create_ds_target_folder
    cache.lock_cache_folder(local_folder)
  File "/scratch/people/gf/miniconda3/envs/py38-torch/lib/python3.8/site-packages/clearml/storage/cache.py", line 273, in lock_cache_folder
    lock.acquire(timeout=0)
  File "/scratch/people/gf/miniconda3/envs/py38-torch/lib/python3.8/site-packages/clearml/utilities/locks/utils.py", line 130, in acquire
    fh = self._get_fh()
  File "/scratch/people/gf/miniconda3/envs/py38-torch/lib/python3.8/site-packages/clearml/utilities/locks/utils.py", line 205, in _get_fh
    return open(self.filename, self.mode, **self.file_open_kwargs)
PermissionError: [Errno 13] Permission denied: '/scratch/clearml-cache/storage_manager/datasets/.lock.000.ds_5f1f42f430b042cfb213e8099cda00b4.clearml'
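The traceback ends in a plain open() on the lock file. One plausible explanation (an assumption, not confirmed in this thread) is that the first user created the lock file under a restrictive umask, so nobody else can open it for writing. The sketch below uses only the standard library to show how the umask decides the mode of a file created by open(); create_with_umask is a hypothetical helper, and the file names are made up for illustration.

```python
import os
import stat
import tempfile


def create_with_umask(path, umask):
    """Create `path` with a plain open() under the given umask; return its mode bits."""
    previous = os.umask(umask)
    try:
        # open() requests 0o666 on creation; the kernel then masks it with the umask.
        with open(path, "a"):
            pass
    finally:
        os.umask(previous)  # always restore the caller's umask
    return stat.S_IMODE(os.stat(path).st_mode)


with tempfile.TemporaryDirectory() as tmp:
    strict = create_with_umask(os.path.join(tmp, ".lock.strict"), 0o022)
    shared = create_with_umask(os.path.join(tmp, ".lock.shared"), 0o002)
    # On a typical POSIX filesystem without default ACLs: 0o644 0o664
    print(oct(strict), oct(shared))
```

With the common default umask of 022 the file ends up writable by its owner only, which would match the PermissionError above when a different user tries to open the same lock file for writing. With umask 002 the group gets write access too, provided the users share a group.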
Issue Analytics
- Created a year ago
- Comments: 8 (7 by maintainers)
Hi @mralgos, thanks for the contribution! Closing this issue 🙂
Hi @mralgos! Can you please open a PR for your fix? I think it looks good