S3 Multipart upload using threads
Hi there,
It would be great to be able to upload chunks to S3 with ThreadPoolExecutor, in other words to send each chunk in a separate thread to make uploading faster. This is how we do it in my current project, and I think it would be better to include this functionality in smart_open. wdyt? Let me know if it’s interesting and I’ll prepare a pull request.
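To make the proposal concrete, here is a minimal sketch of the idea, not smart_open code. The function name `upload_parts_concurrently` and the `upload_part` callable are hypothetical; in a real setup the callable would presumably wrap something like boto3's `client.upload_part(...)` and return the part's ETag.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Tuple


def upload_parts_concurrently(
    chunks: Iterable[bytes],
    upload_part: Callable[[int, bytes], str],
    max_workers: int = 4,
) -> List[Tuple[int, str]]:
    """Send each chunk through `upload_part` on a worker thread.

    `upload_part(part_number, chunk)` is a hypothetical callable that uploads
    one part and returns its ETag. Assumption: with boto3 it could wrap
    client.upload_part(Bucket=..., Key=..., UploadId=..., PartNumber=..., Body=...).
    Returns (part_number, etag) pairs in part order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # S3 part numbers start at 1.
        futures = [
            (part_number, pool.submit(upload_part, part_number, chunk))
            for part_number, chunk in enumerate(chunks, start=1)
        ]
        # result() re-raises any exception raised inside the worker thread.
        return [(part_number, future.result()) for part_number, future in futures]
```

Note that this sketch submits every chunk up front, so all pending chunks stay in memory until their upload finishes; a real implementation would likely want to bound the number of in-flight parts.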
Issue Analytics
- State:
- Created 4 years ago
- Comments:6 (2 by maintainers)
But threads won’t help you with I/O-bound operations. They add more CPU capacity, which is not the bottleneck (I think).
I appreciate your offer, but as part of your PR, we’d need to see convincing benchmarks first. Can you post some concrete numbers?
Threads always add a lot of complexity, plus maintenance and support headaches, so unless the gains are obvious, I’m -1 on complicating the code.
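The kind of numbers requested above could come from a small timing harness along these lines. This is only a sketch: `fake_upload_part` is a hypothetical stand-in that simulates network latency with `time.sleep`, so whatever it prints says nothing about real S3 behaviour. For a real benchmark the stand-in would have to be replaced with actual part uploads.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import List


def fake_upload_part(part_number: int, chunk: bytes, latency: float = 0.05) -> str:
    """Hypothetical stand-in for a network call; sleeps instead of uploading."""
    time.sleep(latency)
    return f"etag-{part_number}"


def time_sequential(chunks: List[bytes]) -> float:
    """Upload parts one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for part_number, chunk in enumerate(chunks, start=1):
        fake_upload_part(part_number, chunk)
    return time.perf_counter() - start


def time_threaded(chunks: List[bytes], max_workers: int = 4) -> float:
    """Upload parts through a thread pool and return the elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() pairs each part number with its chunk; list() waits for all.
        list(pool.map(fake_upload_part, range(1, len(chunks) + 1), chunks))
    return time.perf_counter() - start


if __name__ == "__main__":
    parts = [b"x" * 1024] * 8
    print(f"sequential: {time_sequential(parts):.3f}s")
    print(f"threaded:   {time_threaded(parts):.3f}s")
```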
@vryazanov Thank you for your interest in smart_open. Personally, I don’t think it’s worth including this functionality in smart_open, for several reasons.
First, there are already tools that handle that use case extremely well (for example, the AWS CLI).
Second, it would take a fair bit of effort to implement multi-threaded upload without breaking the existing abstraction of an upload being a write to a file stream. I’m not sure the benefits are worth the cost.
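To illustrate what keeping the file-stream abstraction might involve, here is a rough sketch of a writer that buffers `write()` calls and hands full parts to a thread pool. `ThreadedPartWriter` and its `upload_part` callable are hypothetical names, not part of smart_open, and error handling, retries, and aborting a failed upload are all left out.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


class ThreadedPartWriter:
    """File-like writer that uploads fixed-size parts on worker threads.

    `upload_part(part_number, chunk)` is a hypothetical callable that uploads
    one part and returns its ETag.
    """

    def __init__(
        self,
        upload_part: Callable[[int, bytes], str],
        part_size: int = 5 * 1024 * 1024,  # S3 requires >= 5 MiB except for the last part
        max_workers: int = 4,
    ) -> None:
        self._upload_part = upload_part
        self._part_size = part_size
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._buffer = bytearray()
        self._futures = []
        self.parts: List[Tuple[int, str]] = []

    def _submit(self, chunk: bytes) -> None:
        part_number = len(self._futures) + 1  # part numbers start at 1
        future = self._pool.submit(self._upload_part, part_number, chunk)
        self._futures.append((part_number, future))

    def write(self, data: bytes) -> int:
        self._buffer += data
        # Hand off every complete part; keep the remainder buffered.
        while len(self._buffer) >= self._part_size:
            self._submit(bytes(self._buffer[: self._part_size]))
            del self._buffer[: self._part_size]
        return len(data)

    def close(self) -> None:
        if self._buffer:
            self._submit(bytes(self._buffer))  # final, possibly short, part
            self._buffer.clear()
        try:
            # result() re-raises any worker exception here, at close time.
            self.parts = [(n, f.result()) for n, f in self._futures]
        finally:
            self._pool.shutdown(wait=True)

    def __enter__(self) -> "ThreadedPartWriter":
        return self

    def __exit__(self, *exc_info) -> None:
        self.close()
```

Even in this small form, upload failures only surface when `close()` is called rather than at the `write()` that triggered them, which is one example of how the stream abstraction gets harder to preserve.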