
s3fs does not work with aiobotocore in dask-cloudprovider

Hi,

I am trying to run the following code:

from dask.distributed import Client
from dask_cloudprovider import FargateCluster
from dask import dataframe as df

if __name__ == "__main__":
    # Launch an adaptive Fargate cluster on AWS and connect a client to it
    cluster = FargateCluster()
    cluster.adapt()
    client = Client(cluster)

    # Lazily read parquet data from the S3 bucket referenced by HOLDINGS_URL
    HOLDINGS = df.read_parquet(HOLDINGS_URL)

HOLDINGS_URL is my personal s3 bucket, and this code works without using the FargateCluster.
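For reference, this is roughly what the working baseline looks like when the FargateCluster is left out, with a hypothetical placeholder standing in for HOLDINGS_URL (the real bucket path is not shown in the issue):

from dask import dataframe as df

# Hypothetical placeholder; the actual bucket and key are not given in the issue
HOLDINGS_URL = "s3://my-bucket/holdings/"

# With no FargateCluster/Client created, dask uses its default local scheduler,
# so s3fs only has to talk to S3 from this machine
# (s3fs and a parquet engine such as fastparquet must be installed)
HOLDINGS = df.read_parquet(HOLDINGS_URL)
print(HOLDINGS.head())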

When I do use it, I get the following error:

❯ python main.py
/home/ubuntu/miniconda3/envs/dev/lib/python3.7/contextlib.py:119: UserWarning: Creating your cluster is taking a surprisingly long time. This is likely due to pending resources on AWS. Hang tight!
  next(self.gen)
Traceback (most recent call last):
  File "main.py", line 81, in <module>
    HOLDINGS = df.read_parquet(HOLDINGS_URL)
  File "/home/ubuntu/miniconda3/envs/dev/lib/python3.7/site-packages/dask/dataframe/io/parquet/core.py", line 224, in read_parquet
    **kwargs
  File "/home/ubuntu/miniconda3/envs/dev/lib/python3.7/site-packages/dask/dataframe/io/parquet/fastparquet.py", line 178, in read_metadata
    fs, paths, gather_statistics, **kwargs
  File "/home/ubuntu/miniconda3/envs/dev/lib/python3.7/site-packages/dask/dataframe/io/parquet/fastparquet.py", line 119, in _determine_pf_parts
    elif fs.isdir(paths[0]):
  File "/home/ubuntu/miniconda3/envs/dev/lib/python3.7/site-packages/s3fs/core.py", line 511, in isdir
    return bool(self._lsdir(path))
  File "/home/ubuntu/miniconda3/envs/dev/lib/python3.7/site-packages/s3fs/core.py", line 334, in _lsdir
    for i in it:
  File "/home/ubuntu/miniconda3/envs/dev/lib/python3.7/site-packages/aiobotocore/paginate.py", line 14, in __iter__
    "{self} is an AsyncIterable: use `async for`".format(self=self)
NotImplementedError: <aiobotocore.paginate.AioPageIterator object at 0x7fd87c7b3610> is an AsyncIterable: use `async for`
distributed.client - ERROR - Failed to reconnect to scheduler after 10.00 seconds, closing client
distributed.utils - ERROR -
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/dev/lib/python3.7/site-packages/distributed/utils.py", line 663, in log_errors
    yield
  File "/home/ubuntu/miniconda3/envs/dev/lib/python3.7/site-packages/distributed/client.py", line 1296, in _close
    await gen.with_timeout(timedelta(seconds=2), list(coroutines))
concurrent.futures._base.CancelledError
distributed.utils - ERROR -
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/dev/lib/python3.7/site-packages/distributed/utils.py", line 663, in log_errors
    yield
  File "/home/ubuntu/miniconda3/envs/dev/lib/python3.7/site-packages/distributed/client.py", line 1025, in _reconnect
    await self._close()
  File "/home/ubuntu/miniconda3/envs/dev/lib/python3.7/site-packages/distributed/client.py", line 1296, in _close
    await gen.with_timeout(timedelta(seconds=2), list(coroutines))
concurrent.futures._base.CancelledError

It seems like s3fs doesn’t have support for aiobotocore, unless I am reading this wrong. Does this make these libraries incompatible?
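For context on what that NotImplementedError is guarding against: aiobotocore's paginator yields its pages asynchronously, so it can only be consumed with `async for` inside a coroutine, whereas the `for i in it:` loop in s3fs's _lsdir (visible in the traceback) iterates it synchronously. A minimal standalone sketch of the difference, using a plain async generator rather than aiobotocore itself:

import asyncio

async def pages():
    # Stand-in for an async paginator: yields one "page" of results at a time
    for page in (["a", "b"], ["c"]):
        yield page

async def list_all():
    results = []
    # An async generator must be consumed with `async for` from inside a
    # coroutine; a plain `for page in pages():` fails at iteration time
    async for page in pages():
        results.extend(page)
    return results

print(asyncio.run(list_all()))  # ['a', 'b', 'c']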

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 12 (8 by maintainers)

Top GitHub Comments

1 reaction
jacobtomlinson commented, Jan 10, 2020

I was curious to see how much effort it would be to switch to botocore and so have raised #277 as a POC.
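For readers wondering what switching to botocore would involve in dask-cloudprovider, here is a hedged sketch of the general shape of such a change (hypothetical helper name and region, not the actual code in #277), replacing an async aiobotocore client with a synchronous botocore one:

import botocore.session

def list_ecs_clusters(region="us-east-1"):
    # Synchronous botocore client: no event loop needed, and it avoids the
    # aiobotocore dependency that clashes with s3fs in this issue
    session = botocore.session.get_session()
    ecs = session.create_client("ecs", region_name=region)
    return ecs.list_clusters()["clusterArns"]

# The aiobotocore equivalent would have looked roughly like:
#
#   import aiobotocore
#   session = aiobotocore.get_session()
#   async with session.create_client("ecs", region_name=region) as ecs:
#       resp = await ecs.list_clusters()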

0 reactions
gvelchuru commented, Jan 14, 2020

#277 works for me!
