Paths joined as strings


Looking through the code surrounding dataset downloads, I noticed that get_local_storage_path joins two path-representing strings via an f-string:

https://github.com/activeloopai/deeplake/blob/392f8e92235b29ffa8bd59ee7d0969c7d4bbc2ee/deeplake/util/storage.py#L170

Are there reasons why these are not Paths? I would have expected the following:

from pathlib import Path

def get_local_storage_path(path: str, prefix: Path) -> Path:
    local_cache_name = path.replace("://", "_")
    local_cache_name = local_cache_name.replace("/", "_")
    local_cache_name = local_cache_name.replace("\\", "_")
    return prefix / local_cache_name

Or at least:

import os

def get_local_storage_path(path: str, prefix: str) -> str:
    local_cache_name = path.replace("://", "_")
    local_cache_name = local_cache_name.replace("/", "_")
    local_cache_name = local_cache_name.replace("\\", "_")
    return os.path.join(prefix, local_cache_name)

By the way, I noticed this because I got a string/path with two slashes (//) from get_local_dataset:

deeplake.util.exceptions.DatasetHandlerError: A dataset does not exist at the download location /tmp/deeplake//hub_activeloop_fashionpedia-train. Cannot use access method 'local'. Use access method 'download' to first download the dataset and then use access method 'local' in subsequent runs.
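To illustrate how the double slash can arise, here is a minimal sketch. It assumes the linked line interpolates the two parts as f"{prefix}/{name}" and that the prefix already ends in a slash; both are assumptions on my part, and the helper names below are hypothetical rather than deeplake functions. The posixpath and PurePosixPath variants are used so that the result is the same on every platform.

```python
import posixpath
from pathlib import PurePosixPath


def sanitize(path: str) -> str:
    # Same replacements as in the snippets above.
    return path.replace("://", "_").replace("/", "_").replace("\\", "_")


def join_fstring(path: str, prefix: str) -> str:
    # Assumed shape of the current code: plain string interpolation.
    return f"{prefix}/{sanitize(path)}"


def join_posixpath(path: str, prefix: str) -> str:
    # POSIX flavour of os.path.join: adds a separator only if one is missing.
    return posixpath.join(prefix, sanitize(path))


def join_pathlib(path: str, prefix: str) -> str:
    # The "/" operator on a path object drops the redundant trailing slash.
    return str(PurePosixPath(prefix) / sanitize(path))


prefix = "/tmp/deeplake/"  # hypothetical prefix with a trailing slash
dataset = "hub://activeloop/fashionpedia-train"

print(join_fstring(dataset, prefix))    # /tmp/deeplake//hub_activeloop_fashionpedia-train
print(join_posixpath(dataset, prefix))  # /tmp/deeplake/hub_activeloop_fashionpedia-train
print(join_pathlib(dataset, prefix))    # /tmp/deeplake/hub_activeloop_fashionpedia-train
```

Under these assumptions, only the f-string version reproduces the path from the error message; both path-aware joins collapse the separator.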

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 8

Top GitHub Comments

1 reaction
FayazRahman commented, Oct 31, 2022

@jangop Hey there, thanks for raising the issue. There’s no reason for joining the paths as strings and we should do it the way you suggested.

0 reactions
FayazRahman commented, Nov 7, 2022

> One would not need to switch anything for the typical use case “download if not downloaded; use local if available”.

Yep, there is no need to switch in this case with the new implementation. Just use access_method="local". You only ever need to use access_method="download" to re-download the dataset.

> Just looking at the link you shared, it appears that the second point for local is now wrong:

Oops, that’s a mistake in the docs, thanks for pointing it out.


