
Disallowing ListObjectsV2 at the root of the bucket makes s3fs attempt to create a bucket

See original GitHub issue

What happened:

It’s not uncommon to have a bucket that disallows listing at the root but allows listing under a specific prefix. In that case s3fs fails all writes and attempts to create the bucket, which often fails with a completely unrelated error.
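For concreteness, such a setup typically comes from an IAM or bucket policy that scopes `s3:ListBucket` to a prefix. A hypothetical policy statement (the bucket and prefix names here are illustrative, not taken from the issue) might look like:

```json
{
  "Effect": "Allow",
  "Action": "s3:ListBucket",
  "Resource": "arn:aws:s3:::s3fs-test-bucket-123",
  "Condition": {
    "StringLike": { "s3:prefix": ["foo/*"] }
  }
}
```

With only this statement attached, `ListObjectsV2` at the bucket root returns AccessDenied, while listings under `foo/` succeed.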

What you expected to happen:

Falling back to creating the bucket is very strange behaviour. I imagine it’s legacy and impossible to change, but I would not expect s3fs to require full list-objects permission over the whole bucket just to perform writes.

Minimal Complete Verifiable Example:

In [1]: import s3fs

In [2]: s3 = s3fs.S3FileSystem(anon=False)

In [5]: s3.mkdirs("s3://s3fs-test-bucket-123/foo/bar")
2021-09-17 17:34:50,222 - s3fs - DEBUG - _call_s3 -- CALL: list_objects_v2 - () - {'MaxKeys': 1, 'Bucket': 's3fs-test-bucket-123'}
2021-09-17 17:34:50,516 - s3fs - DEBUG - _call_s3 -- Nonretryable error: An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied
2021-09-17 17:34:50,516 - s3fs - DEBUG - _call_s3 -- CALL: create_bucket - () - {'Bucket': 's3fs-test-bucket-123', 'ACL': ''}
2021-09-17 17:34:50,576 - s3fs - DEBUG - _call_s3 -- Nonretryable error: An error occurred (IllegalLocationConstraintException) when calling the CreateBucket operation: The unspecified location constraint is incompatible for the region specific endpoint this request was sent to.

The full traceback:

File "/home/app/.cache/pypoetry/virtualenvs/x/lib/python3.9/site-packages/dask/dataframe/io/parquet/arrow.py", line 819, in initialize_write
    fs.mkdirs(path, exist_ok=True)
  File "/home/app/.cache/pypoetry/virtualenvs/x/lib/python3.9/site-packages/fsspec/spec.py", line 1159, in mkdirs
    return self.makedirs(path, exist_ok=exist_ok)
  File "/home/app/.cache/pypoetry/virtualenvs/x/lib/python3.9/site-packages/fsspec/asyn.py", line 88, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/home/app/.cache/pypoetry/virtualenvs/x/lib/python3.9/site-packages/fsspec/asyn.py", line 69, in sync
    raise result[0]
  File "/home/app/.cache/pypoetry/virtualenvs/x/lib/python3.9/site-packages/fsspec/asyn.py", line 25, in _runner
    result[0] = await coro
  File "/home/app/.cache/pypoetry/virtualenvs/x/lib/python3.9/site-packages/s3fs/core.py", line 731, in _makedirs
    await self._mkdir(path, create_parents=True)
  File "/home/app/.cache/pypoetry/virtualenvs/x/lib/python3.9/site-packages/s3fs/core.py", line 716, in _mkdir
    await self._call_s3("create_bucket", **params)

It seems like it’s failing to detect that the bucket exists on this line. There are much better ways to detect whether a bucket exists, such as get-bucket-location.
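As a sketch of the alternative (assuming a boto3-style client; the helper name and the fallback logic are mine, not s3fs’s): HeadBucket returns 404 when a bucket is missing and 403 when it exists but is inaccessible, so an AccessDenied response need not be interpreted as "bucket missing":

```python
def bucket_exists(s3_client, bucket):
    """Probe bucket existence via HeadBucket instead of ListObjectsV2.

    HeadBucket returns 200 if the bucket exists and is accessible,
    404 if it does not exist, and 403 if it exists but access is
    denied. A 403 therefore still proves the bucket exists, so the
    caller should NOT fall back to creating it.
    """
    try:
        s3_client.head_bucket(Bucket=bucket)
        return True
    except Exception as exc:  # botocore.exceptions.ClientError in practice
        status = (
            getattr(exc, "response", {})
            .get("ResponseMetadata", {})
            .get("HTTPStatusCode")
        )
        if status == 404:
            return False
        if status == 403:
            # The bucket exists; we just aren't allowed to HeadBucket it.
            return True
        raise
```

This keeps the existence check to a single cheap request and never conflates a permission error with a missing bucket.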

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 11 (11 by maintainers)

Top GitHub Comments

1 reaction
martindurant commented, Sep 17, 2021

> why don’t we use MaxKeys=0 as well

I think this was designed for sub-directories, where we want to distinguish whether it’s the actual path which exists or a sub-path. That wouldn’t apply to buckets, though. So MaxKeys could depend on the context.
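The distinction can be sketched as follows (a hypothetical helper, not actual s3fs code; assumes a boto3-style client). For a bare bucket, the ListObjectsV2 call merely succeeding is proof enough, so MaxKeys=0 would avoid transferring any keys; for a sub-directory, at least one key must come back to show that something exists under the prefix:

```python
def exists_via_list(s3_client, bucket, prefix=""):
    """Probe a bucket or a 'directory' prefix with ListObjectsV2.

    MaxKeys=0 suffices for a bare bucket (a successful call proves
    the bucket exists); MaxKeys=1 is needed for a prefix, since only
    a returned key shows that anything lives under it.
    """
    max_keys = 1 if prefix else 0
    try:
        resp = s3_client.list_objects_v2(
            Bucket=bucket, Prefix=prefix, MaxKeys=max_keys
        )
    except Exception:  # e.g. NoSuchBucket / AccessDenied from botocore
        return False
    if not prefix:
        return True  # the call succeeded, so the bucket exists
    return resp.get("KeyCount", 0) > 0
```

This is only an illustration of why the appropriate MaxKeys differs between the bucket case and the sub-directory case.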

0 reactions
isidentical commented, Sep 17, 2021

> While this certainly adds an extra request, this does seem to be in the error-path rather than the happy-path, so it’s surely not much of a big deal?

Agree with @martindurant that it would reduce the burden, but it’s still costly in some cases.

> And I think the results are also cached?

No, we don’t cache the results.
