Glacier multipart upload - unable to send a precomputed checksum
I have an issue with higher-than-expected CPU usage during multipart uploads with boto3. I have precomputed checksums for each part's data, and I send the checksum in the upload-part request:

multipart_upload.upload_part(range=my_range, body=my_body, checksum=my_checksum)

This correctly adds the x-amz-sha256-tree-hash header. However, inside botocore/handlers.py there is library code that reads the body client-side and computes another checksum, also adding an x-amz-content-sha256 header.
- How can I specify the x-amz-content-sha256 via the upload_part method of a multipart upload resource? I want to pre-compute it client-side, not have it computed automatically at upload time.
- Why are two hashes required anyway? Isn't one checksum enough for uploading a part? We can of course still send both the payload checksum and the tree hash in the final complete-upload stage of the multipart upload.
Note: I've tried using the lower-level client.upload_multipart_part, but that didn't allow explicitly specifying the content SHA-256 either.
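For reference, both header values can be precomputed client-side with nothing but hashlib. A minimal sketch (the function names are my own, not boto3 API): linear_sha256 corresponds to x-amz-content-sha256 (a plain SHA-256 of the part body), and tree_hash corresponds to x-amz-sha256-tree-hash (SHA-256 of each 1 MiB chunk, combined pairwise up to a single root):

```python
import hashlib

MEBIBYTE = 1024 * 1024  # Glacier tree hashes are built over 1 MiB chunks


def linear_sha256(data: bytes) -> str:
    """Plain SHA-256 of the whole payload (the x-amz-content-sha256 value)."""
    return hashlib.sha256(data).hexdigest()


def tree_hash(data: bytes) -> str:
    """SHA-256 tree hash of the payload (the x-amz-sha256-tree-hash value).

    Hash each 1 MiB chunk, then repeatedly hash concatenated pairs of
    digests until a single root digest remains.
    """
    chunks = [data[i:i + MEBIBYTE] for i in range(0, len(data), MEBIBYTE)] or [b""]
    hashes = [hashlib.sha256(chunk).digest() for chunk in chunks]
    while len(hashes) > 1:
        paired = []
        for i in range(0, len(hashes), 2):
            if i + 1 < len(hashes):
                # Combine two neighbouring digests into their parent digest.
                paired.append(hashlib.sha256(hashes[i] + hashes[i + 1]).digest())
            else:
                # Odd digest out is promoted unchanged to the next level.
                paired.append(hashes[i])
        hashes = paired
    return hashes[0].hex()
```

For a part of 1 MiB or less the tree has a single leaf, so the two values coincide; for larger parts they differ.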
Issue Analytics
- State:
- Created: 6 years ago
- Comments: 9 (4 by maintainers)
Top Results From Across the Web
- New – Additional Checksum Algorithms for Amazon S3: "Multipart Object Upload – The AWS SDKs now take advantage of client-side parallelism and compute checksums for each part of a multipart upload."
- Glacier multipart upload fails with ... - GitHub: "Trying to implement a Glacier multipart upload feature in a Rails 3.0/Ruby 1.9.3 application using the new aws-sdk-core rc2."
- complete-multipart-upload - glacier - Amazon AWS: "The ListParts operation returns a list of parts uploaded for a specific multipart upload. It includes checksum information for each uploaded part."
- Amazon S3 Glacier Deep Archive - Noise: "The Amazon S3 Glacier storage classes are purpose-built for data archiving, ... and compute checksums for each part of a multipart upload."
- python - Multipart upload to Amazon Glacier: Content-Range ...: "@Michael-sqlbot is quite right, the issue with the Content-Range was that I was passing the whole file instead of a part."
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Greetings! It looks like this issue hasn’t been active in longer than one year. We encourage you to check if this is still an issue in the latest release. In the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or upvote with a reaction on the initial post to prevent automatic closure. If the issue is already closed, please feel free to open a new one.
@swetashre No, this does not solve the use case. We don't want to send an unsigned payload; we want to send the checksums that are already known, rather than have the client burn a lot of CPU recalculating them (we have them pre-calculated per chunk). It's a usability matter of allowing the client to specify them explicitly.
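One possible workaround, based on reading botocore's source rather than any documented API (so treat it as an assumption that may break across versions): the add_glacier_checksums handler in botocore/handlers.py appears to compute each checksum only when the corresponding header is not already present. A handler registered ahead of it on the before-call event for UploadMultipartPart could therefore inject the precomputed x-amz-content-sha256 and avoid the extra client-side read. A hedged sketch:

```python
def inject_precomputed_sha256(params, my_precomputed_hex=None, **kwargs):
    """Set x-amz-content-sha256 before botocore's own checksum handler runs.

    Assumption: botocore's add_glacier_checksums skips its computation when
    the header is already present in the request dict. `my_precomputed_hex`
    is a hypothetical parameter name for the caller's precomputed digest.
    """
    if my_precomputed_hex is not None:
        params['headers']['x-amz-content-sha256'] = my_precomputed_hex


# Hypothetical wiring (illustration only; requires boto3 and a real vault):
# import functools
# import boto3
# client = boto3.client('glacier')
# client.meta.events.register_first(
#     'before-call.glacier.UploadMultipartPart',
#     functools.partial(inject_precomputed_sha256,
#                       my_precomputed_hex=precomputed_hex))
```

The handler itself is plain Python, so it can be exercised against a fake request dict without touching AWS; whether the library actually honours the injected header should be verified against the botocore version in use.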