Datasets cache folder not shared between users

Hello, a discussion on this issue started on Slack here.

I have a ClearML server hosted on AWS with web authentication enabled. Each ML person has:

  • their own username and password for logging in to the ClearML server
  • their own set of ClearML API credentials
  • their own set of AWS credentials
  • their own clearml.conf file in their home directory

The config file defines the path to the cache folder via:

storage {
    cache {
        # Defaults to system temp folder / cache
        default_base_dir: "/scratch/clearml-cache"
    }
}
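
Since every user's config points at the same cache path, the permissions on that directory matter. The following is a minimal diagnostic sketch, not part of ClearML: the helper name `inspect_cache_dir` is hypothetical, it uses only the Python standard library, and it assumes a POSIX filesystem. It reports whether the group can write to the directory and whether the setgid bit is set.

```python
import os
import stat


def inspect_cache_dir(path: str) -> dict:
    """Summarise the permission facts that matter for a cache shared by several users.

    Hypothetical helper for illustration; it is not a ClearML API.
    """
    info = os.stat(path)
    return {
        "owner_uid": info.st_uid,
        "group_gid": info.st_gid,
        # Permission bits only (including setuid/setgid/sticky), as an octal string.
        "mode": oct(stat.S_IMODE(info.st_mode)),
        # True if members of the directory's group may create or delete entries.
        "group_writable": bool(info.st_mode & stat.S_IWGRP),
        # True if new entries inherit the directory's group (setgid bit).
        "setgid": bool(info.st_mode & stat.S_ISGID),
        # True if the current user can create files in the directory.
        "writable_by_me": os.access(path, os.W_OK | os.X_OK),
    }


if __name__ == "__main__":
    # The path is the one from the config above; adjust it for your machine.
    cache_dir = "/scratch/clearml-cache"
    if os.path.isdir(cache_dir):
        for key, value in inspect_cache_dir(cache_dir).items():
            print(f"{key}: {value}")
    else:
        print(f"{cache_dir} does not exist on this machine")
```

Running this as each affected user shows whether they see the same directory in the same way. It says nothing about the files inside the directory, which can carry different permissions.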

We have some datasets registered on the ClearML server, and the codebase uses get_local_copy() to download the data onto the machine. The problem appears when two or more people want to read the same dataset, i.e. when the cache already exists and isn't corrupted.

The execution fails with this error:

Traceback (most recent call last):
...
    path = Dataset.get(dataset_name=dataset_name,
  File "/scratch/people/gf/miniconda3/envs/py38-torch/lib/python3.8/site-packages/clearml/datasets/dataset.py", line 567, in get_local_copy
    target_folder = self._merge_datasets(
  File "/scratch/people/gf/miniconda3/envs/py38-torch/lib/python3.8/site-packages/clearml/datasets/dataset.py", line 1387, in _merge_datasets
    target_base_folder = self._create_ds_target_folder(
  File "/scratch/people/gf/miniconda3/envs/py38-torch/lib/python3.8/site-packages/clearml/datasets/dataset.py", line 1336, in _create_ds_target_folder
    cache.lock_cache_folder(local_folder)
  File "/scratch/people/gf/miniconda3/envs/py38-torch/lib/python3.8/site-packages/clearml/storage/cache.py", line 273, in lock_cache_folder
    lock.acquire(timeout=0)
  File "/scratch/people/gf/miniconda3/envs/py38-torch/lib/python3.8/site-packages/clearml/utilities/locks/utils.py", line 130, in acquire
    fh = self._get_fh()
  File "/scratch/people/gf/miniconda3/envs/py38-torch/lib/python3.8/site-packages/clearml/utilities/locks/utils.py", line 205, in _get_fh
    return open(self.filename, self.mode, **self.file_open_kwargs)
PermissionError: [Errno 13] Permission denied: '/scratch/clearml-cache/storage_manager/datasets/.lock.000.ds_5f1f42f430b042cfb213e8099cda00b4.clearml'
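
The traceback shows the failure happening while opening the `.lock` file itself, not the dataset files. One plausible explanation (an assumption, not confirmed anywhere above) is that the lock file was created by another user under a restrictive umask, so it is not group-writable. The sketch below uses hypothetical helper names and only the standard library to show how the creating process's umask decides whether a second user in the same group could open such a file for writing.

```python
import os
import stat
import tempfile


def create_with_umask(path: str, umask: int) -> int:
    """Create an empty file under the given umask and return its permission bits."""
    previous = os.umask(umask)
    try:
        # Request read/write for everyone; the umask strips bits from this request.
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
        os.close(fd)
    finally:
        # Always restore the process umask, even if file creation fails.
        os.umask(previous)
    return stat.S_IMODE(os.stat(path).st_mode)


def group_can_write(path: str) -> bool:
    """Return True if the file's group write bit is set."""
    return bool(os.stat(path).st_mode & stat.S_IWGRP)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # File names are made up for the demo; they only mimic a lock file.
        strict = os.path.join(tmp, ".lock.strict")
        shared = os.path.join(tmp, ".lock.shared")
        print(oct(create_with_umask(strict, 0o022)), group_can_write(strict))
        print(oct(create_with_umask(shared, 0o002)), group_can_write(shared))
```

If the existing lock file under /scratch/clearml-cache turns out not to be group-writable, that would be consistent with the error above. Whether the umask, the directory setup, or the library's own file-creation mode is the root cause still has to be checked on the affected machine.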

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 8 (7 by maintainers)

Top GitHub Comments

1 reaction
jkhenning commented, Sep 12, 2022

Hi @mralgos, thanks for the contribution! Closing this issue 🙂

1 reaction
eugen-ajechiloae-clearml commented, Jul 5, 2022

Hi @mralgos! Can you please open a PR for your fix? I think it looks good
