S3FileSystem().exists throws Forbidden when file doesn't exist

I’m using Python’s s3fs library to check whether a particular file exists in S3 with s3fs.S3FileSystem().exists(path), but I’m getting a Forbidden exception. From the stack trace, I can see it fails when calling S3’s head_object method. The documentation for the head_object method says:

If the object you request does not exist, the error Amazon S3 returns depends on whether you also have the s3:ListBucket permission.

  • If you have the s3:ListBucket permission on the bucket, Amazon S3 returns an HTTP status code 404 (“no such key”) error.
  • If you don’t have the s3:ListBucket permission, Amazon S3 returns an HTTP status code 403 (“access denied”) error.

And I have the following task policy attached to my Fargate task:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:ListObjects",
                "s3:ListBucket",
                "s3:HeadObject"
            ],
            "Resource": [
                "arn:aws:s3:::my-bucket/*"
            ]
        }
    ]
}

I’ve checked that the Fargate task really does have this policy attached, but botocore keeps returning 403 instead of 404, even though I have the s3:ListBucket permission. Has anyone experienced this problem?
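One possible cause, which the thread does not confirm: s3:ListBucket is a bucket-level action, so it takes effect only when granted on the bucket ARN (arn:aws:s3:::my-bucket), not on the object ARN pattern (arn:aws:s3:::my-bucket/*). As far as I know, s3:HeadObject and s3:ListObjects are not IAM actions either; HEAD requests are authorized by s3:GetObject and listing by s3:ListBucket. The sketch below builds a policy with the two resource scopes split. The helper name build_task_policy is hypothetical, and the bucket name is the placeholder from the question.

```python
import json


def build_task_policy(bucket):
    """Return an IAM policy dict with object-level and bucket-level grants split.

    Hypothetical helper for illustration; adjust the actions to your needs.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level actions apply to keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
            {
                # s3:ListBucket is a bucket-level action, so it targets
                # the bucket ARN itself (no trailing "/*").
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
        ],
    }


# Print the policy as JSON so it can be pasted into a task role definition.
print(json.dumps(build_task_policy("my-bucket"), indent=4))
```

If this is the cause, head_object on a missing key should start returning 404 once the bucket-level statement is in place.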

Maybe s3fs should consider 403 as the file not existing as well.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

2 reactions
martindurant commented, Feb 7, 2022

To others, yes please do raise your hands if this has happened to you, on fargate or otherwise.

@ianliu “Maybe s3fs should consider 403 as the file not existing as well.”

I’m not convinced by this - the user generally needs to know the difference between something not existing and something that might exist if they had different permissions.

1 reaction
martindurant commented, Feb 7, 2022

“The documentation for head_object is explicit in saying that in both cases the file doesn’t exist. Can forbidden be raised in another scenario?”

That is not my reading. If you see a 404, the file does not exist. If you see a 403, the file may or may not exist, but you do not have permission to know which. It is important for AWS not to leak the possible existence of files to non-privileged users.
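The three outcomes described above (exists, does not exist, cannot tell) can be kept apart in calling code rather than collapsed into a boolean. This is a minimal sketch, assuming the filesystem object surfaces a 403 as Python's built-in PermissionError, which is how the Forbidden error in this issue appears to be raised. The names Existence and check_existence are hypothetical and not part of s3fs.

```python
from enum import Enum


class Existence(Enum):
    EXISTS = "exists"
    MISSING = "missing"
    UNKNOWN = "unknown"  # access denied: the object may or may not exist


def check_existence(fs, path):
    """Classify a path using any object that has an exists(path) method.

    Assumes a 403 from the backend is raised as PermissionError.
    """
    try:
        return Existence.EXISTS if fs.exists(path) else Existence.MISSING
    except PermissionError:
        # A 403 says nothing about whether the key is there.
        return Existence.UNKNOWN
```

With s3fs this would be called as check_existence(s3fs.S3FileSystem(), "my-bucket/some/key"), and the caller decides explicitly how to treat UNKNOWN instead of silently assuming the file is absent.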
