Calling AzureBlobFileSystem.cat on a file path adds a "/" to the end of the file path
I was using dask + zarr to store arrays on Azure Blob Storage using this library, but I ran into an issue.
What happens is that:
- At some point when loading a zarr file from the storage, an fsspec.mapping.FSMap is created (with .fs being an AzureBlobFileSystem instance);
- It calls self.fs.cat(k), where k is a string representing a file path on the blob storage (e.g. my-blob/my-array/.zarray) and self.fs is the AzureBlobFileSystem instance;
- AzureBlobFileSystem.cat calls AzureBlobFileSystem._expand_path at some point, which runs this line: https://github.com/dask/adlfs/blob/3874b3e536fe6b24c824ee096566c8620b623dfa/adlfs/spec.py#L1351
- That call returns my-blob/my-array/.zarray/ and later on the loading crashes with the following error:
ResourceNotFoundError: Operation returned an invalid status 'The specified blob does not exist.'
ErrorCode:BlobNotFound
I think the solution is just to make sure we don’t add a “/” in _expand_path.
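To make the failure mode concrete, here is a minimal sketch using a toy in-memory dictionary in place of a blob container. None of this is adlfs code: BLOBS, is_pseudo_dir, expand_path_buggy, expand_path_fixed and cat are hypothetical names made up for illustration, and the real _expand_path may differ in its details. It only shows why appending "/" to a file path makes the lookup miss, and one possible way to append the delimiter only for pseudo-directories.

```python
# Toy stand-in for a blob container: keys are full blob names.
# Everything here is hypothetical and is NOT taken from adlfs.
BLOBS = {"my-blob/my-array/.zarray": b'{"zarr_format": 2}'}


def is_pseudo_dir(path: str) -> bool:
    """Treat a path as a pseudo-directory if any blob name lives under it."""
    prefix = path.rstrip("/") + "/"
    return any(name.startswith(prefix) for name in BLOBS)


def expand_path_buggy(path: str) -> str:
    # Appends the delimiter unconditionally, even when the path is a file.
    return path.rstrip("/") + "/"


def expand_path_fixed(path: str) -> str:
    # Appends the delimiter only when the path is a pseudo-directory.
    stripped = path.rstrip("/")
    return stripped + "/" if is_pseudo_dir(stripped) else stripped


def cat(path: str, expand) -> bytes:
    """Look up a blob after running the given path-expansion function."""
    key = expand(path)
    if key not in BLOBS:
        raise FileNotFoundError(f"The specified blob does not exist: {key}")
    return BLOBS[key]
```

With this toy store, cat("my-blob/my-array/.zarray", expand_path_fixed) returns the blob contents, while the same call with expand_path_buggy raises FileNotFoundError because the key it looks up ends in "/".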
Issue Analytics
- State:
- Created 2 years ago
- Comments:9 (6 by maintainers)
Top GitHub Comments
Fixed with #217
Thanks! I do think it’s worth having a longer discussion around the ideal behavior here, about how to consistently handle pseudo-directories in these object stores. Maybe that discussion has happened already; I haven’t followed closely (edit: that’s happening in https://github.com/intake/filesystem_spec/issues/562).
I haven’t tested it yet, but something like this might work
That solves the exists() side. Unfortunately things like .open() will need to be updated as well…