
S3 rate limit encountered during DataFrame computation

See original GitHub issue

I was computing the length of a Dask DataFrame with many (several thousand) partitions, stored in Parquet format on S3. Something like:

import dask.dataframe as dd

df = dd.read_parquet("s3://...", ...)   # df has thousands of partitions
len(df)

Shortly after this computation was kicked off, I got the following error (the full traceback is further down):

ClientError: An error occurred (SlowDown) when calling the ListObjectsV2 operation (reached max retries: 4): Please reduce your request rate.

My guess is that since our len(df) implementation triggers lots of small, fast length computations, we’re able to crank through many tasks quickly and hit some S3 rate limit.

The error message tells me to "Please reduce your request rate", and I'm wondering how to go about doing that. Perhaps there is some exponential-backoff behavior I can enable in s3fs so it makes API requests at a progressively slower rate? Or maybe there's some other throttling mechanism I can use to avoid hitting these kinds of S3 limits?
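One throttling knob worth noting (my own suggestion, not something confirmed in this thread): s3fs forwards `config_kwargs` to botocore's client `Config`, and botocore's retry configuration has an `"adaptive"` mode that layers client-side rate limiting on top of exponential backoff. A sketch:

```python
import dask.dataframe as dd

# Hypothetical sketch: pass botocore retry settings through s3fs.
# "adaptive" retry mode adds client-side rate limiting, so SlowDown
# responses should cause subsequent requests to be issued more slowly.
df = dd.read_parquet(
    "s3://...",  # same elided bucket/path as above
    storage_options={
        "config_kwargs": {
            "retries": {"max_attempts": 10, "mode": "adaptive"},
        },
    },
)
len(df)
```

This only shapes the retry behavior of each S3 client; it doesn't reduce the number of tasks Dask schedules at once.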

cc @martindurant

Full traceback:
---------------------------------------------------------------------------
ClientError                               Traceback (most recent call last)
/opt/conda/envs/coiled/lib/python3.9/site-packages/s3fs/core.py in _call_s3()

/opt/conda/envs/coiled/lib/python3.9/site-packages/aiobotocore/client.py in _make_api_call()

ClientError: An error occurred (SlowDown) when calling the ListObjectsV2 operation (reached max retries: 4): Please reduce your request rate.

The above exception was the direct cause of the following exception:

OSError                                   Traceback (most recent call last)
<timed eval> in <module>

~/mambaforge/envs/coiled-jrbourbeau-parquet-demo/lib/python3.9/site-packages/dask/dataframe/core.py in __len__(self)
   3942             return super().__len__()
   3943         else:
-> 3944             return len(s)
   3945 
   3946     def __contains__(self, key):

~/mambaforge/envs/coiled-jrbourbeau-parquet-demo/lib/python3.9/site-packages/dask/dataframe/core.py in __len__(self)
    579 
    580     def __len__(self):
--> 581         return self.reduction(
    582             len, np.sum, token="len", meta=int, split_every=False
    583         ).compute()

~/mambaforge/envs/coiled-jrbourbeau-parquet-demo/lib/python3.9/site-packages/dask/base.py in compute(self, **kwargs)
    284         dask.base.compute
    285         """
--> 286         (result,) = compute(self, traverse=False, **kwargs)
    287         return result
    288 

~/mambaforge/envs/coiled-jrbourbeau-parquet-demo/lib/python3.9/site-packages/dask/base.py in compute(*args, **kwargs)
    566         postcomputes.append(x.__dask_postcompute__())
    567 
--> 568     results = schedule(dsk, keys, **kwargs)
    569     return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
    570 

~/mambaforge/envs/coiled-jrbourbeau-parquet-demo/lib/python3.9/site-packages/distributed/client.py in get(self, dsk, keys, workers, allow_other_workers, resources, sync, asynchronous, direct, retries, priority, fifo_timeout, actors, **kwargs)
   2746                     should_rejoin = False
   2747             try:
-> 2748                 results = self.gather(packed, asynchronous=asynchronous, direct=direct)
   2749             finally:
   2750                 for f in futures.values():

~/mambaforge/envs/coiled-jrbourbeau-parquet-demo/lib/python3.9/site-packages/distributed/client.py in gather(self, futures, errors, direct, asynchronous)
   2023             else:
   2024                 local_worker = None
-> 2025             return self.sync(
   2026                 self._gather,
   2027                 futures,

~/mambaforge/envs/coiled-jrbourbeau-parquet-demo/lib/python3.9/site-packages/distributed/client.py in sync(self, func, asynchronous, callback_timeout, *args, **kwargs)
    864             return future
    865         else:
--> 866             return sync(
    867                 self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
    868             )

~/mambaforge/envs/coiled-jrbourbeau-parquet-demo/lib/python3.9/site-packages/distributed/utils.py in sync(loop, func, callback_timeout, *args, **kwargs)
    324     if error[0]:
    325         typ, exc, tb = error[0]
--> 326         raise exc.with_traceback(tb)
    327     else:
    328         return result[0]

~/mambaforge/envs/coiled-jrbourbeau-parquet-demo/lib/python3.9/site-packages/distributed/utils.py in f()
    307             if callback_timeout is not None:
    308                 future = asyncio.wait_for(future, callback_timeout)
--> 309             result[0] = yield future
    310         except Exception:
    311             error[0] = sys.exc_info()

~/mambaforge/envs/coiled-jrbourbeau-parquet-demo/lib/python3.9/site-packages/tornado/gen.py in run(self)
    760 
    761                     try:
--> 762                         value = future.result()
    763                     except Exception:
    764                         exc_info = sys.exc_info()

~/mambaforge/envs/coiled-jrbourbeau-parquet-demo/lib/python3.9/site-packages/distributed/client.py in _gather(self, futures, errors, direct, local_worker)
   1888                             exc = CancelledError(key)
   1889                         else:
-> 1890                             raise exception.with_traceback(traceback)
   1891                         raise exc
   1892                     if errors == "skip":

/opt/conda/envs/coiled/lib/python3.9/site-packages/dask/optimization.py in __call__()

/opt/conda/envs/coiled/lib/python3.9/site-packages/dask/core.py in get()

/opt/conda/envs/coiled/lib/python3.9/site-packages/dask/core.py in _execute_task()

/opt/conda/envs/coiled/lib/python3.9/site-packages/dask/core.py in <genexpr>()

/opt/conda/envs/coiled/lib/python3.9/site-packages/dask/core.py in _execute_task()

/opt/conda/envs/coiled/lib/python3.9/site-packages/dask/dataframe/io/parquet/core.py in __call__()

/opt/conda/envs/coiled/lib/python3.9/site-packages/dask/dataframe/io/parquet/core.py in read_parquet_part()

/opt/conda/envs/coiled/lib/python3.9/site-packages/dask/dataframe/io/parquet/core.py in <listcomp>()

/opt/conda/envs/coiled/lib/python3.9/site-packages/dask/dataframe/io/parquet/fastparquet.py in read_partition()

/opt/conda/envs/coiled/lib/python3.9/site-packages/fastparquet/api.py in to_pandas()

/opt/conda/envs/coiled/lib/python3.9/site-packages/fastparquet/api.py in read_row_group_file()

/opt/conda/envs/coiled/lib/python3.9/site-packages/fsspec/spec.py in open()

/opt/conda/envs/coiled/lib/python3.9/site-packages/s3fs/core.py in _open()

/opt/conda/envs/coiled/lib/python3.9/site-packages/s3fs/core.py in __init__()

/opt/conda/envs/coiled/lib/python3.9/site-packages/fsspec/spec.py in __init__()

/opt/conda/envs/coiled/lib/python3.9/site-packages/fsspec/asyn.py in wrapper()

/opt/conda/envs/coiled/lib/python3.9/site-packages/fsspec/asyn.py in sync()

/opt/conda/envs/coiled/lib/python3.9/site-packages/fsspec/asyn.py in _runner()

/opt/conda/envs/coiled/lib/python3.9/site-packages/s3fs/core.py in _info()

/opt/conda/envs/coiled/lib/python3.9/site-packages/s3fs/core.py in _simple_info()

/opt/conda/envs/coiled/lib/python3.9/site-packages/s3fs/core.py in _call_s3()

OSError: [Errno 16] Please reduce your request rate.

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 9 (8 by maintainers)

Top GitHub Comments

1 reaction
quasiben commented, Sep 2, 2021

Pinterest engineering has also been working on improving S3/Parquet read efficiency:

https://medium.com/pinterest-engineering/improving-efficiency-and-reducing-runtime-using-s3-read-optimization-b31da4b60fa0

1 reaction
martindurant commented, Aug 19, 2021

@jrbourbeau , can you see if setting S3FileSystem.retries = 10 (in the client and workers too) avoids this problem?
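Since `S3FileSystem.retries` is a class attribute that s3fs consults in its internal retry loop, applying this suggestion on a distributed cluster means setting it on the workers as well as the client, e.g. via `Client.run`. A sketch (untested against a real cluster; the scheduler address is hypothetical):

```python
from distributed import Client

def raise_s3_retries(n=10):
    # s3fs consults this class attribute in its retry loop around S3 calls
    import s3fs
    s3fs.S3FileSystem.retries = n

client = Client("tcp://scheduler-address:8786")  # hypothetical address
client.run(raise_s3_retries)  # on every worker
raise_s3_retries()            # and locally, for client-side listing calls
```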

