
[BUG] BlockBlob `OpenWriteAsync` method takes twice as much time with the new storage package


Describe the bug
Writing a block blob with the OpenWriteAsync method takes twice as much time with the new Azure.Storage.Blobs package as with the deprecated WindowsAzure.Storage package.

Expected behavior
Performance with the new Azure.Storage.Blobs package should be improved, or at least on par with the old package.

Actual behavior (include Exception or Stack Trace)
Using the Azure.Storage.Blobs package (version 12.9.1), the BlockBlobClient.OpenWriteAsync() method takes twice as long as CloudBlockBlob.OpenWriteAsync() from the now-deprecated WindowsAzure.Storage package (version 9.3.3).

To Reproduce
We noticed the performance degradation in a feature that creates a zip file in a storage container from a number of images. We have a container with 100 images, and another container in the same storage account (General Purpose v1) where we store the resulting zip file.

First we open a write stream for the zip block blob with OpenWriteAsync(), and then we create zip entries from the individual image blobs read from the other container in the same storage account.

  1. With the new Azure.Storage.Blobs package (on both net5.0 and netcoreapp3.1)
            // Create the zip blob and stream entries into it
            var zip = zipContainer.GetBlockBlobClient("media.zip");
            using (var zipArchive = new ZipArchive(
                stream: await zip.OpenWriteAsync(overwrite: true).ConfigureAwait(false),
                mode: ZipArchiveMode.Create,
                leaveOpen: false))
            {
                var sw = new Stopwatch();      // total time per blob
                var swOpen = new Stopwatch();  // time spent in OpenReadAsync
                var swCopy = new Stopwatch();  // time spent copying into the zip entry
                for (int i = 1; i <= 100; i++)
                {
                    sw.Start();
                    var blob = blobList[i];
                    var fileName = string.Format(CultureInfo.InvariantCulture, "{0:D8}_{1}", i, "image.jpg");

                    var zipEntry = zipArchive.CreateEntry(fileName, CompressionLevel.NoCompression);
                    using var zipStream = zipEntry.Open();

                    swOpen.Start();
                    using var blobStream = await blob.OpenReadAsync();
                    swOpen.Stop();
                    swCopy.Start();
                    await blobStream.CopyToAsync(zipStream);
                    swCopy.Stop();

                    sw.Stop();
                    Console.WriteLine($"\tBlob {i} transferred in {sw.ElapsedMilliseconds} ms");
                    Console.WriteLine($"\t\tOpened in {swOpen.ElapsedMilliseconds} ms");
                    Console.WriteLine($"\t\tCopied in {swCopy.ElapsedMilliseconds} ms");
                    sw.Reset();
                    swOpen.Reset();
                    swCopy.Reset();
                }
            }

Result: (screenshot of per-blob timings not reproduced here)

  2. With the deprecated WindowsAzure.Storage package (targeting netcoreapp3.1)
            // Create the zip blob and stream entries into it
            var zip = zipContainer.GetBlockBlobReference("media.zip");
            using (var zipArchive = new ZipArchive(
                stream: await zip.OpenWriteAsync(),
                mode: ZipArchiveMode.Create,
                leaveOpen: false))
            {
                var sw = new Stopwatch();      // total time per blob
                var swOpen = new Stopwatch();  // time spent in OpenReadAsync
                var swCopy = new Stopwatch();  // time spent copying into the zip entry
                for (int i = 1; i <= 100; i++)
                {
                    sw.Start();
                    var blob = (CloudBlockBlob)blobList[i];
                    var fileName = string.Format(CultureInfo.InvariantCulture, "{0:D8}_{1}", i, "image.jpg");

                    var zipEntry = zipArchive.CreateEntry(fileName, CompressionLevel.NoCompression);
                    using var zipStream = zipEntry.Open();

                    swOpen.Start();
                    using var blobStream = await blob.OpenReadAsync();
                    swOpen.Stop();
                    swCopy.Start();
                    await blobStream.CopyToAsync(zipStream);
                    swCopy.Stop();

                    sw.Stop();
                    Console.WriteLine($"\tBlob {i} transferred in {sw.ElapsedMilliseconds} ms");
                    Console.WriteLine($"\t\tOpened in {swOpen.ElapsedMilliseconds} ms");
                    Console.WriteLine($"\t\tCopied in {swCopy.ElapsedMilliseconds} ms");
                    sw.Reset();
                    swOpen.Reset();
                    swCopy.Reset();
                }
            }

Result: (screenshot of per-blob timings not reproduced here)

Environment:

  • Tested the deprecated WindowsAzure.Storage package with the netcoreapp3.1 target framework, and the new Azure.Storage.Blobs package with both netcoreapp3.1 and net5.0 target frameworks, running on a Standard E8s v3 Azure VM (Windows 10 Enterprise)

  • IDE and version: VS Code / dotnet CLI

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 1
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
kasobol-msft commented, Jul 9, 2021

@hnuguse I was able to reproduce the issue. The new version of OpenWrite tries to follow the Stream contract more closely than earlier versions did, i.e. Flush/FlushAsync is fully operational by default. That means whenever the ZipArchive decides to flush, a snapshot of the data is materialized in the target blob, which translates into more requests to storage and higher latency. See the trace for reference (screenshot not reproduced here).
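
(If you want to see those extra requests yourself, one option is to route the Azure SDK's event source logs to the console while the repro above runs. This uses the general-purpose logging facility in Azure.Core, not the exact tooling behind the trace above.)

    // Requires the Azure.Core package (already pulled in by Azure.Storage.Blobs).
    using System.Diagnostics.Tracing;
    using Azure.Core.Diagnostics;

    // Prints every request/response the Azure SDK pipeline issues, so the extra
    // storage calls triggered by each ZipArchive flush become visible.
    using var azureLog = AzureEventSourceListener.CreateConsoleLogger(EventLevel.Informational);

    // ...run the Azure.Storage.Blobs version of the repro while the listener is active...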

There’s a somewhat related issue open at https://github.com/Azure/azure-sdk-for-net/issues/20652, where we discuss whether a flag disabling intermediate flushes should be added to the OpenWrite API.

Meanwhile, you can consider wrapping the Stream returned by OpenWrite to disable flushes; see the sample for reference here: https://gist.github.com/kasobol-msft/dd88c6a86f06dc981e0de96ef1169c56 (a sketch of the idea follows below). After applying the workaround the timings look better (screenshot not reproduced here).
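
The gist linked above is the authoritative sample; the sketch here only illustrates the idea, and the NonFlushingStream name is made up for illustration. It is a pass-through Stream whose Flush/FlushAsync are no-ops, so ZipArchive's intermediate flushes no longer force the blob contents to be materialized mid-write, while disposing the inner OpenWrite stream still commits the blob.

    using System;
    using System.IO;
    using System.Threading;
    using System.Threading.Tasks;

    // Pass-through wrapper that swallows Flush/FlushAsync. Everything else is
    // delegated to the inner stream returned by OpenWriteAsync.
    public sealed class NonFlushingStream : Stream
    {
        private readonly Stream _inner;

        public NonFlushingStream(Stream inner) =>
            _inner = inner ?? throw new ArgumentNullException(nameof(inner));

        // Ignore intermediate flushes; the final commit happens on Dispose.
        public override void Flush() { }
        public override Task FlushAsync(CancellationToken cancellationToken) => Task.CompletedTask;

        public override bool CanRead => _inner.CanRead;
        public override bool CanSeek => _inner.CanSeek;
        public override bool CanWrite => _inner.CanWrite;
        public override long Length => _inner.Length;
        public override long Position { get => _inner.Position; set => _inner.Position = value; }
        public override int Read(byte[] buffer, int offset, int count) => _inner.Read(buffer, offset, count);
        public override long Seek(long offset, SeekOrigin origin) => _inner.Seek(offset, origin);
        public override void SetLength(long value) => _inner.SetLength(value);
        public override void Write(byte[] buffer, int offset, int count) => _inner.Write(buffer, offset, count);
        public override Task WriteAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken) =>
            _inner.WriteAsync(buffer, offset, count, cancellationToken);

        protected override void Dispose(bool disposing)
        {
            if (disposing)
            {
                _inner.Dispose(); // disposing the OpenWrite stream commits the block blob
            }
            base.Dispose(disposing);
        }
    }

In the repro above, the only change would be wrapping the stream handed to ZipArchive:

    using (var zipArchive = new ZipArchive(
        stream: new NonFlushingStream(await zip.OpenWriteAsync(overwrite: true).ConfigureAwait(false)),
        mode: ZipArchiveMode.Create,
        leaveOpen: false))
    {
        // ...same loop as before...
    }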

1 reaction
hnuguse commented, Jul 8, 2021

The size is 2 KB per image, and we have 1,000 of these in the container.

Transferring all of these takes a couple of minutes, even though the total size is only around 1 MB. I attached the data we used for testing here: media.zip
