Compressed file ended before the end-of-stream marker was reached
I'm trying to download files from a third-party API and upload them to S3.
from smart_open import open

with open(url, 'rb') as fin:
    with open(f"s3://{BUCKET}/{fpath}", 'wb') as fout:
        for line in fin:
            fout.write(line)
smart_open works fine with most files, but sometimes I get this error:
ERROR:root:Compressed file ended before the end-of-stream marker was reached
For example, I get the error with this file: https://datasets.tardis.dev/v1/deribit/options_chain/2020/08/01/OPTIONS.csv.gz In a browser, however, I can download it without issues.
My log:
INFO:root:Opened https://datasets.tardis.dev/v1/deribit/options_chain/2020/08/01/OPTIONS.csv.gz
INFO:botocore.credentials:Found credentials in shared credentials file: ~/.aws/credentials
INFO:smart_open.s3:smart_open.s3.MultipartWriter('my-bucket', 'tardis/table_20200801.csv.gz'): uploading part_num: 1, 52456530 bytes (total 0.049GB)
INFO:smart_open.s3:smart_open.s3.MultipartWriter('my-bucket', 'tardis/table_20200801.csv.gz'): uploading part_num: 2, 52437910 bytes (total 0.098GB)
INFO:smart_open.s3:smart_open.s3.MultipartWriter('my-bucket', 'tardis/table_20200801.csv.gz'): uploading part_num: 3, 5235580 bytes (total 0.103GB)
ERROR:root:Compressed file ended before the end-of-stream marker was reached
It's actually the inbound connection that is breaking. Here’s code that reproduces it:
If you run that several times, you’ll see that it never makes it to the end of the download (around 570MB).
If you reduce the sleep time to e.g. 1s, you’ll be able to stream that file from the server successfully.
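The pattern described above (read a chunk, then pause, as a stand-in for the time spent uploading a part) can be sketched roughly as follows. This is only an illustration of the idea, not the snippet from the thread: `drain_slowly`, `download_with_pauses`, and their parameters are hypothetical names, and plain `urllib` is used here on the assumption that the drop happens on the inbound HTTP side regardless of the client library.

```python
import time
import urllib.request


def drain_slowly(stream, chunk_size=1024 ** 2, delay=0.0, sleep=time.sleep):
    """Read `stream` to the end in chunks, pausing after each chunk.

    Returns the total number of bytes read. `sleep` is injectable so the
    pause can be replaced in tests (hypothetical helper, for illustration).
    """
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of stream, or the server stopped sending data
        total += len(chunk)
        sleep(delay)  # simulates the client being busy with something else
    return total


def download_with_pauses(url, delay):
    """Fetch `url` over HTTP, pausing `delay` seconds between chunks."""
    with urllib.request.urlopen(url) as response:
        return drain_slowly(response, delay=delay)
```

Calling `download_with_pauses` once with a long delay and once with a short one, then comparing the returned byte counts (or any exception raised) against the expected file size, should show whether the server tolerates the pauses.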
I think the server drops the connection when it thinks the client has been idle for too long. Unfortunately, this “idle” time is when smart_open is uploading a multipart part to S3. You can try to shorten it by using a smaller multipart part size, e.g. from your original code
Here I’m using a 5MB part size instead of the default 50MB.
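A sketch of how the smaller part size might be applied to the original copy loop, assuming smart_open's `min_part_size` transport parameter for S3 writes. The `copy_to_s3` wrapper and its `opener` argument are hypothetical and exist only so the function can be tried without network access; `url`, `BUCKET`, and `fpath` are the names from the original code.

```python
MIN_PART_SIZE = 5 * 1024 ** 2  # 5 MB, the smallest part size S3 accepts


def copy_to_s3(url, s3_uri, opener=None, part_size=MIN_PART_SIZE):
    """Stream `url` to `s3_uri`, uploading parts of roughly `part_size` bytes."""
    if opener is None:
        # Assumption: smart_open is installed. Imported lazily so the
        # function can also be exercised with a stand-in opener.
        from smart_open import open as opener
    # Smaller parts mean shorter pauses on the inbound side per upload.
    params = {"min_part_size": part_size}
    with opener(url, "rb") as fin:
        with opener(s3_uri, "wb", transport_params=params) as fout:
            for line in fin:
                fout.write(line)
```

With smart_open installed, this would be called as `copy_to_s3(url, f"s3://{BUCKET}/{fpath}")`. Smaller parts shorten each pause but increase the number of upload requests, so this may reduce the problem rather than eliminate it.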
I’m going to close this for now because it really isn’t a problem with smart_open. The server is flat out hanging up on us in a way that we cannot easily detect.