S3 list operation requested when file not found
See original GitHub issue (#382).
What happened:
When trying to open a file that does not exist in an AWS S3 bucket, a list operation is requested and a FileNotFoundError exception is then raised.
What you expected to happen:
I would expect only a FileNotFoundError exception to be raised, with no list operation. The list request seems unnecessary, and AWS S3 charges for it.
Minimal Complete Verifiable Example: When running:
import s3fs
storage_options = {
    'key': 'my-key',
    'secret': 'my-secret',
    'use_ssl': False,
    'client_kwargs': {
        'endpoint_url': 'http://my-url:my-port'
    },
}
file_system = s3fs.S3FileSystem(**storage_options)
file_system.open('s3://my-bucket/my_file.txt', 'r')
The output is:
FileNotFoundError: my-bucket/my_file.txt
My HTTP traffic monitor also shows the GET request for the list operation (note the list-type=2):
GET /my-bucket?list-type=2&prefix=my_file.txt%2F&delimiter=%2F&max-keys=1&encoding-type=url
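For reference, that request line can be reconstructed locally from its query parameters (a standalone sketch using only the standard library; no request is actually sent):

```python
from urllib.parse import urlencode, quote

# Parameters of the observed ListObjectsV2 request. Note the trailing "/"
# on the prefix: the filesystem appears to be probing whether anything
# exists under my_file.txt/, i.e. whether the path is a "directory".
params = {
    "list-type": "2",
    "prefix": "my_file.txt/",
    "delimiter": "/",
    "max-keys": "1",
    "encoding-type": "url",
}
# safe="" forces "/" to be percent-encoded as %2F, matching the capture.
query = urlencode(params, quote_via=quote, safe="")
print(f"GET /my-bucket?{query}")
# → GET /my-bucket?list-type=2&prefix=my_file.txt%2F&delimiter=%2F&max-keys=1&encoding-type=url
```

Each such request is billed by AWS as one LIST operation, which is the cost the report is about.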
Anything else we need to know?: The list operation seems to be requested when this line is executed.
Environment:
- Python version: 3.6.9
- Operating System: Ubuntu 18.04.3 LTS
- Install method (conda, pip, source): pip install s3fs
- Output of pip list:
Package Version
----------------- -------
aiobotocore 1.1.2
aiohttp 3.6.2
aioitertools 0.7.0
async-timeout 3.0.1
attrs 20.2.0
botocore 1.17.44
chardet 3.0.4
docutils 0.15.2
fsspec 0.8.3
idna 2.10
idna-ssl 1.1.0
jmespath 0.10.0
multidict 4.7.6
pip 20.2.3
pkg-resources 0.0.0
python-dateutil 2.8.1
s3fs 0.5.1
setuptools 50.3.0
six 1.15.0
typing-extensions 3.7.4.3
urllib3 1.25.10
wheel 0.35.1
wrapt 1.12.1
yarl 1.6.0
Issue Analytics
- State:
- Created 3 years ago
- Comments: 8 (3 by maintainers)
Top GitHub Comments
The problem with this is that you are ignoring the file listing cache, so you do the lookup on every open, even if you have previously listed the prefix. You could have a middle ground of checking self._ls_from_cache first, but you do end up repeating code.

Would it be desirable to modify S3FileSystem._open() so that it does something like this before attempting to open a file?
According to this, translate_boto_error(e) will return a FileNotFoundError when e.response['Error'].get('Code') is '404'.

In the meantime, I'm using file_system.s3.head_object() as a workaround.
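The translation being described can be sketched as a small mapping (a hypothetical simplification; the real translate_boto_error covers more error codes):

```python
def translate_s3_error_code(code, path):
    """Map an S3 error code to the Python exception to raise (simplified,
    illustrative version; not the actual s3fs implementation)."""
    if code in ("404", "NoSuchKey", "NoSuchBucket"):
        return FileNotFoundError(path)
    if code in ("403", "AccessDenied"):
        return PermissionError(path)
    return OSError(path)
```

With a mapping like this, a head_object call that fails with a 404 can be turned directly into the expected FileNotFoundError, which is essentially what the workaround above relies on.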