AzureBlobFileSystem hanging on read/list operations
What happened:
I’m trying to use AzureBlobFileSystem to generate a MutableMapping object that will work with zarr. I can create my file system object but all operations thereafter hang indefinitely. The traceback, shown below, indicates this could have something to do with the recent async work in fsspec. cc @martindurant on that point.
What you expected to happen:
I expected the filesystem to return blob contents. Here’s an example where I simply try to retrieve the bytes of a known blob (not using the mapper interface).
Minimal Complete Verifiable Example:
import os

from adlfs import AzureBlobFileSystem

path = 'carbonplan-data/raw/terraclimate/4000m/raster.zarr/.zgroup'
fs = AzureBlobFileSystem(
    account_name="carbonplan",
    account_key=os.environ["BLOB_ACCOUNT_KEY"],  # account key is read from the environment
)
fs.cat(path)  # hangs indefinitely
Interrupting this code shows the following traceback:
---------------------------------------------------------------------------
KeyboardInterrupt Traceback (most recent call last)
<ipython-input-48-b68548dcfea1> in <module>
7 )
8
----> 9 fs.cat(path)
/srv/conda/envs/notebook/lib/python3.7/site-packages/fsspec/asyn.py in cat(self, path, recursive, **kwargs)
218
219 def cat(self, path, recursive=False, **kwargs):
--> 220 paths = self.expand_path(path, recursive=recursive)
221 out = sync(self.loop, self._cat, paths, **kwargs)
222 if (
/srv/conda/envs/notebook/lib/python3.7/site-packages/adlfs/spec.py in expand_path(self, path, recursive, maxdepth)
1081
1082 def expand_path(self, path, recursive=False, maxdepth=None):
-> 1083 return sync(self.loop, self._expand_path, path, recursive, maxdepth)
1084
1085 async def _expand_path(self, path, recursive=False, maxdepth=None):
/srv/conda/envs/notebook/lib/python3.7/site-packages/fsspec/asyn.py in sync(loop, func, callback_timeout, *args, **kwargs)
62 else:
63 while not e.is_set():
---> 64 e.wait(10)
65 if error[0]:
66 typ, exc, tb = error[0]
/srv/conda/envs/notebook/lib/python3.7/threading.py in wait(self, timeout)
550 signaled = self._flag
551 if not signaled:
--> 552 signaled = self._cond.wait(timeout)
553 return signaled
554
/srv/conda/envs/notebook/lib/python3.7/threading.py in wait(self, timeout)
298 else:
299 if timeout > 0:
--> 300 gotit = waiter.acquire(True, timeout)
301 else:
302 gotit = waiter.acquire(False)
KeyboardInterrupt:
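The traceback shows the call parked in a loop that repeatedly waits on a threading event that never gets set. While debugging a hang like this, it can help to turn the indefinite wait into an explicit error. The following is a minimal sketch using only the standard library; `call_with_timeout` and `hung_call` are hypothetical names made up for illustration and are not part of adlfs or fsspec.

```python
import threading


def call_with_timeout(func, *args, timeout=5.0, **kwargs):
    """Run func in a daemon thread and raise TimeoutError if it does not return.

    Hypothetical debugging helper, not an adlfs or fsspec API.
    """
    result = {}

    def target():
        try:
            result["value"] = func(*args, **kwargs)
        except BaseException as exc:  # pass any failure back to the caller
            result["error"] = exc

    # A daemon thread does not keep the interpreter alive if the call never returns.
    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    worker.join(timeout)
    if worker.is_alive():
        name = getattr(func, "__name__", repr(func))
        raise TimeoutError(f"{name} did not return within {timeout} s")
    if "error" in result:
        raise result["error"]
    return result["value"]


# Stand-in for a hanging call: waits on an event that nobody sets.
never_set = threading.Event()


def hung_call():
    never_set.wait()


try:
    call_with_timeout(hung_call, timeout=0.2)
    outcome = "returned"
except TimeoutError:
    outcome = "timed out"
```

With the objects from the example above, a call such as `call_with_timeout(fs.cat, path, timeout=30)` would raise instead of blocking the notebook. The stuck worker thread is not cancelled, so this is a diagnostic aid rather than a fix.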
Anything else we need to know?:
Some related conversation here: https://github.com/zarr-developers/zarr-python/issues/618
Environment:
- Dask version: 2.25.0
- Python version: 3.7.8
- Operating System: Linux
- Install method (conda, pip, source): conda
Issue Analytics
- Created 3 years ago
- Comments:10 (3 by maintainers)
Top GitHub Comments
When working off master, the above described issue is resolved. @hayesgb - feel free to close or leave open if you want to address @martindurant’s maybe_sync comments first. Thanks for the quick fix!

Both will work when called from non-async code, but maybe_sync will do the right thing wherever it’s called from. You should make sure to have some async tests like https://github.com/intake/filesystem_spec/blob/master/fsspec/implementations/tests/test_http.py#L239