Thread safety and suggested lifetime patterns of storage classes not documented

See original GitHub issue

The docs (e.g., https://docs.microsoft.com/en-us/dotnet/api/microsoft.windowsazure.storage.cloudstorageaccount) don’t mention anything about the thread safety or suggested lifetime/usage patterns of CloudStorageAccount, CloudBlobClient, CloudBlobContainer, etc.

In a server application, how are those supposed to be created and retained? E.g., CloudStorageAccount and CloudBlobClient instances as global singletons, CloudBlobContainer per call? The sample at https://github.com/Azure-Samples/storage-blobs-dotnet-webapp/blob/master/WebApp-Storage-DotNet/Controllers/HomeController.cs instantiates all these classes per call - is this the suggested usage or is it actually expensive to reinstantiate them every time?

(The “Cloud_Storage_Account_Sample” link at https://docs.microsoft.com/en-us/dotnet/api/microsoft.windowsazure.storage.cloudstorageaccount?view=azure-dotnet doesn’t work, BTW, but even if it did, the sample doesn’t show how to use those classes in production.)

Issue Analytics

  • State: open
  • Created: 5 years ago
  • Reactions: 63
  • Comments: 56 (8 by maintainers)

Top GitHub Comments

AGBrown commented, Nov 15, 2019 (70 reactions)

Well, a lot of +1, a non-trivial number of watchers and several brief comments asking for help aren’t prompting a response so here goes …

Nothing in the original issue report has yet been addressed in all this discussion over the last 16 months.


Breaking this issue down:

I agree with the following:

@kfarmer-msft I suggest splitting this issue into two, first is a simple question “are storage client functions thread safe?” this should be easy enough to answer, and for Docs to be updated. I’m more interested in this question, as an API consumer I do need to know about thread safety. While the other part is still related “lifetime guidance”, it might take longer due to the nature of the question. I hope this approach can expedite the answer a little bit.

What documentation do we currently have, if any?

If we look at Azure / Function / How-to guides / Develop / Manage Connections / Static Clients it says what @moshegutman quoted previously (version-specific link):

Azure Storage clients can manage connections if you use a single, static client.

but it also says:

For more information about why we recommend static clients, see Improper instantiation antipattern.

Which in turn says (version-specific link):

examples of broker classes that are relevant to Azure applications … QueueClient

There is no mention of storage classes. And whereas QueueClient does have documentation on thread safety and storage patterns (if you can stumble across it), the storage broker classes don’t seem to have corresponding documentation, even under the equivalent documentation section.

That last page then goes on to say:

  • The key element of this antipattern is repeatedly creating and destroying instances of a shareable object. If a class is not shareable (not thread-safe), then this antipattern does not apply.

  • The type of shared resource might dictate whether you should use a singleton or create a pool. The HttpClient class is designed to be shared rather than pooled. Other objects might support pooling, enabling the system to spread the workload across multiple instances.

Finally we have the cited sample code which is just plain bad code and no help at all. (Overwriting a static field on every call to Index??? 😱 ).

Why is this documentation not helping?

So here’s the kicker: this issue is all about the fact that (other than one throwaway reference hidden in the Azure Functions documentation, with nothing to support it in the documentation for the relevant classes):

  1. there is no documentation of thread-safety on the classes such as CloudBlobClient or the newer v12 BlobServiceClient etc.; and
  2. there is no documentation on lifetime or usage patterns for these same classes.

Other asides in this thread.

To revisit @kfarmer-msft’s (well-meaning) comment:

perf questions are best answered by determining what a realistic performance goal for your application is, making a straightforward implementation of it, and then measuring.

I respectfully, but firmly, take the view that this isn’t answering the issue. This may answer the OP’s “is it actually expensive to reinstantiate them every time” question, but not the underlying problem. It may be a performance optimisation to work out if object re-use can improve performance, but the other aspect is that it is fundamental to building a basic application to know how to use objects that own and use expensive network resources in order to address scaling and to prevent resource starvation. [1]

If we are forced to do what you recommend then we are faced with the choice of either (a) never getting a project off the ground as it will become mired in blackbox testing; or (b) risking immediate failures on deployment to production. This is the exact reason why we now have HttpClient and HttpClientFactory with authoritative guidance on how to use them. There is only one way to use HttpClient, not different ways you choose depending on your best guess of usage patterns, performance and expected scale.

What are our current options, as customers of the storage SDK?

The problem with the current lack of documentation is that we are doomed if we do and doomed if we don’t:

  • If we “guess” that there will be scaling issues then we choose to use singleton instances, but if they aren’t thread-safe then we’ll get production time runtime exceptions pretty quickly.
  • If we “guess” that they aren’t thread-safe we have to implement appropriate pools (currently by hand, which isn’t simple in itself), but if they aren’t pool-friendly then we’ll get production time runtime exceptions pretty quickly.
  • If we “guess” they aren’t thread-safe or pool-friendly then we have to choose to do one-instance-per-call, but then we might hit port exhaustion or performance issues pretty quickly at production time.
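The three guesses above correspond to three lifetime strategies. A minimal sketch of each follows, assuming a hypothetical `StandInClient` type that only counts constructions; it is not part of the Azure SDK, and the sketch says nothing about which strategy the real storage classes support, which is the very thing the issue says is undocumented.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for a storage client such as CloudBlobClient.
// It only counts how many instances were created; it is NOT the Azure SDK.
public sealed class StandInClient
{
    private static int _created;
    public static int Created => Volatile.Read(ref _created);

    public StandInClient() => Interlocked.Increment(ref _created);

    public string Download(string name) => "contents of " + name;
}

public static class LifetimeStrategies
{
    // Guess 1: one shared instance for the whole process.
    // Only correct if the client type is thread-safe.
    private static readonly Lazy<StandInClient> Shared =
        new Lazy<StandInClient>(() => new StandInClient());

    public static string ViaSingleton(string name) => Shared.Value.Download(name);

    // Guess 2: a hand-rolled pool, so no instance is used by two threads at once.
    // Only correct if instances tolerate being reused after sitting idle.
    private static readonly ConcurrentBag<StandInClient> Pool =
        new ConcurrentBag<StandInClient>();

    public static string ViaPool(string name)
    {
        if (!Pool.TryTake(out var client))
        {
            client = new StandInClient();
        }
        try
        {
            return client.Download(name);
        }
        finally
        {
            Pool.Add(client); // hand the instance back for the next caller
        }
    }

    // Guess 3: a fresh instance per call.
    // Always isolated, but pays the construction cost every time.
    public static string ViaPerCall(string name) => new StandInClient().Download(name);
}
```

With the stand-in type, two sequential calls through each strategy create one, one and two instances respectively. Whether that difference matters for the real classes depends on what each instance owns, which is why the missing documentation forces a guess.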

TL;DR:

There is no guidance on thread-safety for critical classes in the Azure SDK like CloudBlobClient or CloudBlobContainer or the newer v12 BlobServiceClient etc. We need a documentation point for each critical type on each of: thread-safety; concurrent async calls; and lifetime recommendations. At the very least we need this guidance for the Azure storage classes in a page like this one (although this might be better placed at the type and member level for Azure storage SDK classes, and it would be nice to have a consistent approach across all the Azure SDKs).


Footnote: When I want to know how to use a type I expect to go to the type’s documentation page, like BlobContainerClient or CloudBlobClient, instead of having to stumble across unrelated pages like the Azure Functions documentation. At this point in time most of those type documentation pages offer nothing beyond the simple function overload and parameter information that IntelliSense already shows in the IDE. A thread-safety/concurrency section for types/members as appropriate, and a “lifetime and usage pattern” recommendation (singleton, pooled, transient) section for types, would completely sort this issue out. It would also handle consistent documentation for legacy versions after major-version, namespace and even assembly name/NuGet changes.

Even Microsoft.Extensions.ObjectPool doesn’t handle things like IDisposable or IAsyncDisposable, or pooling objects by key, and building async-friendly, disposable-friendly, keyed pools isn’t a trivial piece of work.
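To give a sense of what such a pool involves, here is a deliberately small sketch of a keyed pool for `IAsyncDisposable` items. `KeyedPool` and `DemoConnection` are hypothetical names invented for this illustration, and the sketch assumes C# 8 or later. It leaves out most of what makes the real problem hard: no size limit, no health check for stale items, and no guard against renting after disposal.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical pooled resource used only to exercise the sketch.
public sealed class DemoConnection : IAsyncDisposable
{
    public string Account { get; }
    public bool Disposed { get; private set; }

    public DemoConnection(string account) => Account = account;

    public ValueTask DisposeAsync()
    {
        Disposed = true;
        return default; // completes synchronously
    }
}

// Minimal keyed pool: one bag of idle items per key.
public sealed class KeyedPool<TKey, TItem> : IAsyncDisposable
    where TKey : notnull
    where TItem : class, IAsyncDisposable
{
    private readonly ConcurrentDictionary<TKey, ConcurrentBag<TItem>> _idle =
        new ConcurrentDictionary<TKey, ConcurrentBag<TItem>>();
    private readonly Func<TKey, TItem> _factory;

    public KeyedPool(Func<TKey, TItem> factory) => _factory = factory;

    // Reuse an idle item for this key if one exists, otherwise create one.
    public TItem Rent(TKey key) =>
        _idle.TryGetValue(key, out var bag) && bag.TryTake(out var item)
            ? item
            : _factory(key);

    // Put an item back so a later Rent for the same key can reuse it.
    public void Return(TKey key, TItem item) =>
        _idle.GetOrAdd(key, _ => new ConcurrentBag<TItem>()).Add(item);

    // Dispose every idle item. Items still rented out are not tracked.
    public async ValueTask DisposeAsync()
    {
        foreach (var bag in _idle.Values)
        {
            while (bag.TryTake(out var item))
            {
                await item.DisposeAsync();
            }
        }
    }
}
```

Even at this size the sketch has gaps a production pool would need to close, such as tracking rented items so they can be disposed and bounding how many idle items are kept per key.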

Maybe they expire old connections in the background and don’t silently stand them up again? In that case you could get non-deterministic errors at runtime due to stale connections.

[1] Also, it is not the role of a customer of an SDK to blackbox test every class for thread-safety, race conditions, and performance and adjust their usage of the SDK accordingly. Apart from anything else, if we were all to do this we’d each be reinventing the wheel every time, and it would be a massive waste of global resources!

fschmied commented, Nov 18, 2018 (64 reactions)

@kfarmer-msft My question arose from the fact that unit of work scoping (or per call scoping) is not the way an HttpClient should be used, as this pattern could cause port exhaustion if many units of work - e.g., web requests - are processed in a short time. So I checked the documentation for the Azure Storage clients and found - nothing.
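The HttpClient concern can be sketched as follows. This is only an illustration of the two scoping styles for `HttpClient` itself; the `Downloader` class and its method names are hypothetical, and nothing here shows how the storage clients manage their own connections internally.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class Downloader
{
    // Shared style: one HttpClient reused for every request in the process,
    // so connections can be reused across units of work.
    public static readonly HttpClient Shared = new HttpClient();

    public static Task<string> GetWithSharedClientAsync(Uri uri) =>
        Shared.GetStringAsync(uri);

    // Per-call style: each unit of work creates and disposes its own client.
    // Under many requests in a short time, this is the pattern described
    // above as a possible cause of port exhaustion.
    public static async Task<string> GetWithPerCallClientAsync(Uri uri)
    {
        using (var client = new HttpClient())
        {
            return await client.GetStringAsync(uri);
        }
    }
}
```

The open question in this issue is which of these two styles a storage client wrapping HTTP connections should follow.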

It’s cool that you recommend a unit of work pattern and that apparently this will minimize the creation of HttpClient instances at the moment, but I’m uneasy when you say - as I understand it - that you cannot guarantee this will hold for the future. I can’t build my application on a “one storage client per unit of work” pattern now and have it run into port exhaustion after updating the Azure client libraries.

As always, perf questions are best answered by determining what a realistic performance goal for your application is, making a straightforward implementation of it, and then measuring.

I have no specific performance question. My goal is to use the storage clients in a way that is capable of parallel processing/multi-threading and will not cause port exhaustion. I really think the Azure Storage team needs to document how I can reach this goal. (In the API documentation, not some random Github issue.)

Top Results From Across the Web

  • Document suggested lifetime patterns for clients #8941: Thread safety and suggested lifetime patterns of storage classes not documented Azure/azure-storage-net#732.
  • Is BlobContainerClient in azure c++ sdk thread safe?: I think this MS issue is relevant Thread safety and suggested lifetime patterns of storage classes not documented. – Richard Critten.
  • Reading 20: Thread Safety: Our first way of achieving thread safety is confinement. Thread confinement is a simple idea: you avoid races on mutable data by keeping...
  • Is the singleton pattern prone to thread safety problems?: Yes, as other answers have noted, it can be implemented in a thread safe manner. That said, it tends to be prone to...
  • Thread safety with the Azure SDK for .NET: Learn about thread safety in Azure SDK client objects and how this design impacts client lifetime management.
