Broken `abfs://` on version 2022.7.0
I'm running the following code to access an Azure blob storage container:
import adlfs
fs = adlfs.AzureBlobFileSystem()
fs.ls("abfs://my-container-name")
This works perfectly with `fsspec==2022.5.0` and `adlfs==2022.7.0`. However, with `fsspec==2022.7.0` and `adlfs==2022.7.0` I get a `FileNotFoundError` arising from `azure.core.exceptions.ResourceNotFoundError: The specified container does not exist.` It does work, however, if I run:
fs.ls("az://my-container-name")
Expectation: the `abfs://...` syntax should be supported on Python environments containing `fsspec==2022.7.0`.
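Until the regression is resolved, one possible stopgap (besides pinning `fsspec==2022.5.0`) is to rewrite the URL scheme before calling the filesystem, since the `az://` form is reported above to work. The sketch below is only an illustration: `to_az_url` is a hypothetical helper, not part of fsspec or adlfs, and it assumes the two schemes are interchangeable for the paths in question.

```python
def to_az_url(url: str) -> str:
    """Rewrite an abfs:// URL to the az:// form; leave any other URL untouched.

    Hypothetical helper for illustration only; not an fsspec or adlfs API.
    """
    prefix = "abfs://"
    if url.startswith(prefix):
        return "az://" + url[len(prefix):]
    return url


print(to_az_url("abfs://my-container-name"))
```

The result could then be passed to `fs.ls(...)` in place of the original `abfs://` string.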
Environment:
- Platform: Ubuntu Linux
- Python: 3.9
- Credentials provided using the `AZURE_STORAGE_CONNECTION_STRING` environment variable.
Working Python environment (from `pip freeze`):
adal==1.2.7
adlfs==2022.7.0
aiohttp==3.8.1
aiosignal==1.2.0
async-timeout==4.0.2
attrs==21.4.0
azure-core==1.24.2
azure-datalake-store==0.0.52
azure-identity==1.10.0
azure-storage-blob==12.13.0
certifi @ file:///opt/conda/conda-bld/certifi_1655968806487/work/certifi
cffi==1.15.1
charset-normalizer==2.1.0
cryptography==37.0.4
frozenlist==1.3.0
fsspec==2022.5.0
idna==3.3
isodate==0.6.1
msal==1.18.0
msal-extensions==1.0.0
msrest==0.7.1
multidict==6.0.2
oauthlib==3.2.0
portalocker==2.5.1
pycparser==2.21
PyJWT==2.4.0
python-dateutil==2.8.2
requests==2.28.1
requests-oauthlib==1.3.1
six==1.16.0
treelite==2.0.0
treelite-runtime==2.0.0
typing_extensions==4.3.0
urllib3==1.26.11
yarl==1.7.2
Broken Python environment:
adal==1.2.7
adlfs==2022.7.0
aiohttp==3.8.1
aiosignal==1.2.0
async-timeout==4.0.2
attrs==21.4.0
azure-core==1.24.2
azure-datalake-store==0.0.52
azure-identity==1.10.0
azure-storage-blob==12.13.0
certifi @ file:///opt/conda/conda-bld/certifi_1655968806487/work/certifi
cffi==1.15.1
charset-normalizer==2.1.0
cryptography==37.0.4
frozenlist==1.3.0
fsspec==2022.7.0
idna==3.3
isodate==0.6.1
msal==1.18.0
msal-extensions==1.0.0
msrest==0.7.1
multidict==6.0.2
oauthlib==3.2.0
portalocker==2.5.1
pycparser==2.21
PyJWT==2.4.0
python-dateutil==2.8.2
requests==2.28.1
requests-oauthlib==1.3.1
six==1.16.0
treelite==2.0.0
treelite-runtime==2.0.0
typing_extensions==4.3.0
urllib3==1.26.11
yarl==1.7.2
Hello @martindurant, could the cause of this weird behavior be that fsspec/adlfs implements its own `_strip_protocol` method in the `AzureBlobFileSystem` class? (Link to code)

I suspect that `AzureBlobFileSystem` counts on receiving `ops = infer_storage_options(path)` without the `"host"` already joined onto the `"path"` key, since it does `ops["path"] = ops["host"] + ops["path"]` itself some lines later.

But once `"adl", "abfs", "abfss"` were added to the protocol list in `infer_storage_options`, you already do this join internally here.

@martindurant @toby-coleman @hayesgb Please would it be possible to explain a bit more what has happened here? I don't know much about the inner workings of fsspec or adlfs, but I think the problem seems to be not that #988 was wrong, but that it was not released in coordination with the required changes on the adlfs side.
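The double join suspected above can be modelled in a few lines. This is a simplified sketch of the hypothesis, not the actual fsspec or adlfs source: `infer_options` and `strip_protocol` are hypothetical stand-ins, and the `join_host_protocols` parameter exists only to compare the behavior before and after `abfs` is added to the list.

```python
from urllib.parse import urlsplit


def infer_options(url, join_host_protocols):
    """Stand-in for a URL parser that folds the host into the path for listed protocols."""
    parts = urlsplit(url)
    options = {"protocol": parts.scheme, "host": parts.netloc, "path": parts.path}
    if parts.scheme in join_host_protocols:
        options["path"] = options["host"] + options["path"]
    return options


def strip_protocol(url, join_host_protocols):
    """Stand-in for a filesystem method that always prepends the host itself."""
    ops = infer_options(url, join_host_protocols)
    ops["path"] = ops["host"] + ops["path"]
    return ops["path"]


url = "abfs://my-container-name/data.csv"
# "abfs" not in the parser's list: the host is joined exactly once.
before = strip_protocol(url, join_host_protocols=("s3", "gcs"))
# "abfs" in the parser's list: the host ends up in the path twice.
after = strip_protocol(url, join_host_protocols=("s3", "gcs", "abfs"))
print(before)
print(after)
```

In this model the second call produces a path whose first component is the container name repeated, which would be consistent with a "container does not exist" error.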
Context: @SajidAlamQB and I work on Kedro, which relies heavily on fsspec for handling datasets. Two years ago fsspec's handling of abfs was raised as a possible bug (https://github.com/fsspec/filesystem_spec/issues/256; https://github.com/fsspec/adlfs/issues/45). From reading those (see https://github.com/fsspec/adlfs/issues/45#issuecomment-608689378), it seems that adding `abfs` and `adl` was indeed the correct thing to do, but needed to be done in coordination with a change to `adlfs`.

It seems that instead of this change happening, we on Kedro rolled our own version of `fsspec.utils.infer_storage_options`, which is the same as fsspec's version but includes `abfs` and `adl` in the list of `CLOUD_PROTOCOLS`. Since then we have received requests from users to extend this list further (`abfss` and `gdrive`). These changes have all worked well for our users (I'm not sure why, given that apparently it should have required a change to `adlfs` also?), but as per https://github.com/kedro-org/kedro/issues/1632 it is a bit annoying to maintain the list of `CLOUD_PROTOCOLS` on our side.

Hence, if at all possible, we'd really like to go back to using fsspec's `infer_storage_options`. Is there any way of coordinating changes with the other libraries so that we can make the changes that we'd like to `CLOUD_PROTOCOLS`? I understand this might not be easy to achieve, but if it is possible then that would be much appreciated! 🙏 Otherwise we will need to continue maintaining our own version of `infer_storage_options`.
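For readers unfamiliar with the approach described in this comment, a downstream-maintained protocol list might look roughly like the sketch below. It is not Kedro's actual code: `parse_filepath` is a hypothetical name, and the tuple contains only the protocols mentioned in the comment.

```python
from urllib.parse import urlsplit

# Hypothetical list for illustration, limited to protocols named in the comment.
CLOUD_PROTOCOLS = ("adl", "abfs", "abfss", "gdrive")


def parse_filepath(filepath):
    """Split a path or URL into protocol and path, keeping the host for cloud protocols."""
    parts = urlsplit(filepath)
    protocol = parts.scheme or "file"
    path = parts.path
    if protocol in CLOUD_PROTOCOLS and parts.netloc:
        # For cloud storage the host is the container or bucket, so keep it in the path.
        path = parts.netloc + path
    return {"protocol": protocol, "path": path}


print(parse_filepath("abfss://container/folder/file.csv"))
print(parse_filepath("/tmp/file.csv"))
```

Every new protocol requires another entry in the tuple, which is the maintenance burden the comment refers to.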