Streaming Uploads?
Hey,
Sorry for treating this as a mailing list, I didn’t see any other method for contact, so I went ahead and opened an issue.
I’m trying to use boto3 to upload files that are uploaded to PyPI to S3. The majority of these files will be < 60 MB, but a handful will be larger (up to a few hundred MB in size). I’m trying to figure out which interface is the right one to use for this. Right now, PyPI receives a streaming upload from the client along with the expected MD5 hash of the entire file once it’s been uploaded. I’m wondering if I can do something like:
import hashlib

class HashingFileWrapper:
    def __init__(self, wrapped, md5_hash):
        self.wrapped = wrapped
        self.md5_hash = md5_hash
        self.hash_ctx = hashlib.md5()

    def read(self, *args, **kwargs):
        # Read from the wrapped stream and hash each chunk as it passes through.
        chunk = self.wrapped.read(*args, **kwargs)
        self.hash_ctx.update(chunk)
        if not chunk:
            # End of stream: verify the digest of everything read.
            if self.hash_ctx.hexdigest() != self.md5_hash:
                raise ValueError("Hash Does Not Match")
        return chunk

my_s3_object.put(
    Body=HashingFileWrapper(file_like_object, md5_hash),
    ContentLength=file_size,
    ContentMD5=md5_hash,
)
Will that stream it up to S3 without buffering the whole file in memory? If not, is my only option to buffer the data to a temporary file and then use the my_s3_object.upload_file() interface?
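The incremental-hashing part of that idea can be checked locally with only the standard library: feeding chunks to hashlib.md5() as they are read produces the same digest as hashing the whole payload at once (a sketch; the payload and chunk size here are arbitrary stand-ins):

```python
import hashlib
import io

payload = b"example package bytes" * 1000
expected_md5 = hashlib.md5(payload).hexdigest()

# Read the stream in fixed-size chunks, updating the hash as we go --
# the same pattern the wrapper's read() would use.
stream = io.BytesIO(payload)
ctx = hashlib.md5()
while True:
    chunk = stream.read(8192)
    if not chunk:
        break
    ctx.update(chunk)

assert ctx.hexdigest() == expected_md5
```

Since the hash context only ever holds its internal state plus the current chunk, memory use stays constant regardless of payload size.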
Issue Analytics
- State:
- Created 8 years ago
- Comments: 13 (8 by maintainers)
Top GitHub Comments
Yes it does.
Although not (yet?) mentioned in the documentation for Botocore’s S3.Client.put_object(), it does accept a file-like object, and there is even a test case to ensure that. You won’t find the streaming implementation in this code base, because it is actually provided by the underlying library, requests.
Both Boto3’s Object.put() and Bucket.put_object() call Botocore’s put_object(), so they support streaming as well. It is mentioned here.
The higher-level S3Transfer in Boto3 provides more convenient features. Its upload_file() accepts a filename, automatically splits a big file into multiple chunks (default chunk size 8 MB, default concurrency 10), and streams each chunk through the aforementioned low-level APIs.
Hi rayluo, can we pass an actual data buffer instead of a filename to upload_file() in boto3?
With put_object(), I am seeing a high memory footprint. Is there some cleanup call that I am missing after the put_object() call?