ContentHash when using BlockBlobClient.OpenWriteAsync(true)
Describe the bug
I am upgrading from storage v11 to v12. Looking at Issue 17676, I understand that there are two modes depending on file size: Put Blob, and Put Block / Put Block List.
A test that creates a small blob using OpenWriteAsync(true) is failing because the ContentHash is not being set. When I try to set ContentHash myself, I am not allowed to, because it doesn’t match what the server calculates. If the server can calculate it, why isn’t it set?
Expected behavior
When I set the BlobHttpHeaders explicitly, it should either accept whatever I give it or set the value itself:
await using var outputStream = await blockBlobClient.OpenWriteAsync(true, new BlockBlobOpenWriteOptions
{
    HttpHeaders = new BlobHttpHeaders
    {
        ContentHash = content.ToMd5()
    }
});
I get:
The MD5 value specified in the request did not match with the MD5 value calculated by the server.
RequestId:6a06f4b4-a243-438a-bb83-f7fc23531ee1
Time:2021-03-04T17:08:10.0157094Z
Status: 400 (The MD5 value specified in the request did not match with the MD5 value calculated by the server.)
ErrorCode: Md5Mismatch
Additional Information:
UserSpecifiedMd5: mJNTIjPK/5jNCDoRawE8Cw==
ServerCalculatedMd5: 1B2M2Y8AsgTpgAmY7PhCfg==
Granted, the MD5 I calculate is wrong: it is the MD5 of the string rather than of the gzipped string. My concerns are as follows:
- If the server can calculate the MD5, then why doesn’t it set the value into ContentMD5? If you run this code without the headers then the ContentMD5 property is not set.
- How do I make all file sizes behave consistently with OpenWriteAsync? I can’t see any obvious way to prevent the server from calculating the MD5 and to let me set what I want. I think with UploadAsync I would use InitialTransferLength = 0, which would then use Put Block / Put Block List for everything?
- If the server doesn’t set ContentMD5, then why can’t the user set it to anything? I haven’t tested this yet, but presumably if the blob were larger than 256 MB, it would use Put Block / Put Block List, the server couldn’t or wouldn’t calculate the hash, and I could set whatever I wanted? In that case the behaviour would be inconsistent.
- I don’t want to refactor to use UploadAsync if I can help it. The way streams are passed around in this app makes this undesirable.
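One detail in the error output may help narrow this down: the ServerCalculatedMd5 value, 1B2M2Y8AsgTpgAmY7PhCfg==, is the Base64-encoded MD5 of zero bytes. That would be consistent with the service hashing an empty request body, for example if the blob is first created empty when the write stream is opened; that interpretation is an assumption and is not confirmed anywhere in this issue. The sketch below compares the three candidate hashes locally. The class and helper names are hypothetical, and UTF-8 is assumed for the string encoding because the ToMd5() helper from the report is not shown.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Security.Cryptography;
using System.Text;

public static class Md5Comparison
{
    // Base64-encoded MD5, the same form the service uses in its error message.
    private static string Md5Base64(byte[] data)
    {
        using var md5 = MD5.Create();
        return Convert.ToBase64String(md5.ComputeHash(data));
    }

    // Gzip a string in memory. UTF-8 is an assumption, not taken from the issue.
    private static byte[] Gzip(string text)
    {
        using var buffer = new MemoryStream();
        using (var gzip = new GZipStream(buffer, CompressionMode.Compress, leaveOpen: true))
        {
            byte[] raw = Encoding.UTF8.GetBytes(text);
            gzip.Write(raw, 0, raw.Length);
        }
        // The GZipStream is disposed above, so the gzip trailer has been written.
        return buffer.ToArray();
    }

    public static void Main()
    {
        var content = "something small";

        // MD5 of an empty payload: prints 1B2M2Y8AsgTpgAmY7PhCfg==
        Console.WriteLine(Md5Base64(Array.Empty<byte>()));

        // MD5 of the uncompressed text (what the request in the report supplied).
        Console.WriteLine(Md5Base64(Encoding.UTF8.GetBytes(content)));

        // MD5 of the gzipped bytes (what actually ends up in the blob).
        Console.WriteLine(Md5Base64(Gzip(content)));
    }
}
```

The gzipped bytes, and therefore the last hash, can differ between runtime versions and compression settings, so that value should be computed from the exact bytes written rather than hard-coded.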
Actual behavior (include Exception or Stack Trace)
A small blob doesn’t get a ContentMD5 set, but the server knows what it should be and doesn’t let me set it to what I want.
To Reproduce
var content = "something small";
// ToMd5() is not a framework method; it appears to be a custom extension that is not shown here.
await using var outputStream = await blockBlobClient.OpenWriteAsync(true, new BlockBlobOpenWriteOptions
{
    HttpHeaders = new BlobHttpHeaders
    {
        ContentHash = content.ToMd5()
    }
});
await using var gZipStream = new GZipStream(outputStream, CompressionMode.Compress);
await using var streamWriter = new StreamWriter(gZipStream);
await streamWriter.WriteAsync(content);
Environment:
- Name and version of the Library package used: Azure.Storage.Blobs 12.8.0
- Hosting platform or OS and .NET runtime version (dotnet --info output for .NET Core projects): Windows 10, .NET Core 3.1
- IDE and version: Visual Studio 16.8.6
Issue Analytics
- State:
- Created 3 years ago
- Comments: 13 (7 by maintainers)
Top GitHub Comments
Yes, you can set it to any value.
I meant any value to ContentHash/x-ms-blob-content-md5. I assume the answer is yes.
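Going by the comments above, the stored x-ms-blob-content-md5 value can be set freely. One way to act on that, sketched here under several assumptions, is to write the blob without a ContentHash and then set the header in a separate call once the stream has been closed. WriteGzippedAsync is a hypothetical helper name, the payload is buffered in memory (reasonable only for small blobs), and UTF-8 is assumed; OpenWriteAsync and SetHttpHeadersAsync are existing Azure.Storage.Blobs v12 methods. This has not been run against a storage account.

```csharp
using System.IO;
using System.IO.Compression;
using System.Security.Cryptography;
using System.Text;
using System.Threading.Tasks;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized;

public static class GzipBlobWriter
{
    // Hypothetical helper: uploads gzipped text, then records its MD5 on the blob.
    public static async Task WriteGzippedAsync(BlockBlobClient blockBlobClient, string content)
    {
        // Compress in memory first so the hash covers the bytes that are stored.
        byte[] compressed;
        using (var buffer = new MemoryStream())
        {
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress, leaveOpen: true))
            {
                byte[] raw = Encoding.UTF8.GetBytes(content); // encoding is an assumption
                gzip.Write(raw, 0, raw.Length);
            }
            compressed = buffer.ToArray();
        }

        byte[] hash;
        using (var md5 = MD5.Create())
        {
            hash = md5.ComputeHash(compressed);
        }

        // Open the write stream without HttpHeaders, so no hash is sent up front.
        await using (Stream output = await blockBlobClient.OpenWriteAsync(overwrite: true))
        {
            await output.WriteAsync(compressed, 0, compressed.Length);
        }

        // Set the stored hash afterwards. Note that this call replaces the blob's
        // HTTP headers as a set, so include any others (ContentType, and so on)
        // that need to be kept.
        await blockBlobClient.SetHttpHeadersAsync(new BlobHttpHeaders
        {
            ContentHash = hash
        });
    }
}
```

The extra round trip is the cost of this approach; in exchange, small and large blobs go through the same code path regardless of which upload mode the client picks.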