lib-storage should compute MD5 for parts

Is your feature request related to a problem? Please describe.

The UploadPart API supports specifying an MD5/SHA256 checksum to verify that each part is received intact. lib-storage does not let me make use of this feature.

Describe the solution you’d like

Add a flag to the Upload constructor that either enables or disables automatic MD5 computation for each part.
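As a rough illustration of what per-part MD5 computation would involve, here is a minimal sketch using only Node's built-in crypto module. The helper names contentMd5 and buildUploadPartInputs are hypothetical and are not part of lib-storage; the field names follow the UploadPart input of @aws-sdk/client-s3, and the chunks are assumed to be Buffers or strings already split by the caller.

```javascript
const crypto = require('node:crypto')

// Content-MD5 is the base64-encoded MD5 digest of the request body.
function contentMd5(body) {
  return crypto.createHash('md5').update(body).digest('base64')
}

// Hypothetical helper (not a lib-storage API): builds one UploadPart
// input object per chunk, each carrying its own ContentMD5 value.
function buildUploadPartInputs({ Bucket, Key, UploadId }, chunks) {
  return chunks.map((chunk, index) => ({
    Bucket,
    Key,
    UploadId,
    PartNumber: index + 1, // S3 part numbers start at 1
    Body: chunk,
    ContentMD5: contentMd5(chunk),
  }))
}
```

Each resulting object could then be passed to an UploadPartCommand, so that S3 rejects any part whose body does not match its digest.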

Describe alternatives you’ve considered

Fork lib-storage.

Issue Analytics

  • State: open
  • Created 2 years ago
  • Reactions: 3
  • Comments: 8

Top GitHub Comments

5 reactions
phillycheeze commented, Dec 11, 2021

Okay, I figured out a workaround using the existing packages in this library. There is a package called @aws-sdk/middleware-apply-body-checksum that most of the commands in client-s3 use to automatically generate an MD5 hash and add the Content-MD5 header. I was able to get this to work via the following:

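const { S3Client } = require('@aws-sdk/client-s3')
const { getApplyMd5BodyChecksumPlugin } = require('@aws-sdk/middleware-apply-body-checksum')

const client = new S3Client({
  region: 'us-east-2',
})
// Register the MD5 body checksum plugin on the client's middleware stack,
// so every request sent through this client gets a Content-MD5 header.
client.middlewareStack.use(
  getApplyMd5BodyChecksumPlugin(client.config)
)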

This injects the plugin into the middleware stack for all S3 client requests, after which you can continue to use the Upload class from lib-storage as normal. There may be other types of requests that you don’t want this middleware to run for; in that case, call middlewareStack.use as above just before your request, then call middlewareStack.remove('applyMd5BodyChecksumMiddleware') afterwards to remove it.
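The conditional use/remove pattern could be wrapped up roughly as follows. This is only a sketch: withTemporaryPlugin is a hypothetical helper name, and the client and plugin are passed in so that the snippet itself has no SDK dependency. It assumes only that the client exposes middlewareStack.use and middlewareStack.remove, as in the snippet above.

```javascript
// Middleware name as quoted in the comment above.
const MD5_MIDDLEWARE_NAME = 'applyMd5BodyChecksumMiddleware'

// Hypothetical helper: applies `plugin` to `client` only while `fn` runs,
// then removes the named middleware again, even if `fn` throws.
async function withTemporaryPlugin(client, plugin, middlewareName, fn) {
  client.middlewareStack.use(plugin)
  try {
    return await fn(client)
  } finally {
    client.middlewareStack.remove(middlewareName)
  }
}

// Usage sketch (assumes `client`, `params` and the imports shown earlier):
// const { Upload } = require('@aws-sdk/lib-storage')
// await withTemporaryPlugin(
//   client,
//   getApplyMd5BodyChecksumPlugin(client.config),
//   MD5_MIDDLEWARE_NAME,
//   (c) => new Upload({ client: c, params }).done()
// )
```

Note that any other request sent through the same client while fn is still running would also pass through the middleware, so a dedicated client for these uploads may be the simpler option.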

After running the workaround above, multipart uploads work like a charm when setting ObjectLockMode and ObjectLockRetainUntilDate in the upload params.
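For reference, upload params carrying those two fields might be built along these lines. This is a sketch under assumptions: buildObjectLockParams is a hypothetical helper, the 'GOVERNANCE' mode and the retention period are arbitrary illustrative choices, and the bucket is assumed to have Object Lock enabled.

```javascript
// Hypothetical helper: builds Upload params with an Object Lock retention
// period of `retainDays` days, counted from `now`.
function buildObjectLockParams({ bucket, key, body, retainDays, now = new Date() }) {
  const msPerDay = 24 * 60 * 60 * 1000
  return {
    Bucket: bucket,
    Key: key,
    Body: body,
    ObjectLockMode: 'GOVERNANCE', // S3 also accepts 'COMPLIANCE'
    ObjectLockRetainUntilDate: new Date(now.getTime() + retainDays * msPerDay),
  }
}

// Usage sketch (assumes the `client` configured with the MD5 plugin above):
// const { Upload } = require('@aws-sdk/lib-storage')
// const params = buildObjectLockParams({
//   bucket: 'my-bucket', key: 'big-file.bin', body: stream, retainDays: 30,
// })
// await new Upload({ client, params }).done()
```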

Side note: the Upload class expects params of type PutObjectCommandInput, but when you give it a stream it never actually calls PutObject, since it needs to do a multipart upload. AWS’s API requirements for PutObject and CreateMultipartUpload differ in some minor ways. One difference is Content-MD5, which is expected by the PutObject command but isn’t used by the CreateMultipartUpload command. Looking through the source, it seems some hacks have been made in lib-storage to set the Body key to undefined as a way around this. It’s a little odd for params to be validated against the input type of a command that is never called, and that is probably why this issue took so long to debug: originally I kept getting MalformedXML errors with a 400 status code back from AWS.

2 reactions
phillycheeze commented, Mar 2, 2022

I spent several days figuring all of that out, and at the end of the day a MalformedXML error is really a bug in the SDK itself, not user error. I mention this because using v2 of the aws-sdk-js package might save you a lot of time and sanity! 😅
