Call openAsync during initialization to avoid this startup latency on the first request:

Hi, according to the performance guidelines, OpenAsync() should be called during initialization to avoid high latency on the first request, but I can’t find this API in this SDK. Is this now handled by the SDK? Should I call a different method?

Thank you.

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
nahk-ivanov commented, Apr 11, 2023

I guess I will have to resurrect this thread. Apparently, executing one operation does not “warm up” everything that needs to be “warmed up”. After every redeployment of the service, we are hit with 60-100 ms latency randomly for point-reads for some time, instead of the typical 1.6 ms. I traced the diagnostics for these and they have one thing in common, which is:

{
  "event": "ChannelAcquisitionStarted",
  "durationInMs": 106.6234
}

and

"connectionStats": {
  "waitforConnectionInit": "True",
}

Apparently, it needs to open new connections, even though an operation (a read) had already succeeded right before that. I looked at the code briefly, and the Microsoft.Azure.Cosmos.Direct library (which doesn’t appear to be open sourced) seems to keep an internal “pool” of connections and to use a hash of the Activity ID to decide which connection in the pool to use (this pertains to Microsoft.Azure.Documents.Rntbd.LoadBalancingChannel, where GetLoadBalancedPartition hashes the Activity ID). So, if two subsequent requests happen to get Activity IDs that hash to the same “partition”, the connection is reused and you get low latency; but if a third request hashes to a different “partition” and has to open another connection, you take the hit. This is essentially random (as random as the Activity ID GUIDs at the core of the distribution), so the best you can do is run a probability-based number of requests to get all connections opened in the p99 case, which is less than satisfactory.
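
To put a rough number on that last point, here is a minimal Python sketch. It assumes each request lands on one of a fixed number of connection “partitions” uniformly at random and independently of the others. The partition count is a hypothetical input, since the actual pool size is not stated in this thread, and the function names are made up for illustration. Under those assumptions this is the classic coupon-collector problem, and inclusion-exclusion gives the probability that every partition has been hit after a given number of requests.

```python
from math import comb


def prob_all_opened(partitions: int, requests: int) -> float:
    """Probability that `requests` uniformly random requests touch every
    one of `partitions` connection slots at least once (inclusion-exclusion).

    ASSUMPTION: slots are chosen uniformly and independently per request.
    """
    return sum(
        (-1) ** i * comb(partitions, i) * (1 - i / partitions) ** requests
        for i in range(partitions + 1)
    )


def requests_for_confidence(partitions: int, confidence: float = 0.99) -> int:
    """Smallest number of warm-up requests for which all slots have been
    touched with probability >= `confidence`."""
    n = partitions  # fewer requests than slots can never cover them all
    while prob_all_opened(partitions, n) < confidence:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical pool sizes; the real value depends on SDK internals.
    for k in (2, 4, 8):
        print(k, requests_for_confidence(k))
```

Under these assumptions, even a pool of only 2 partitions needs 8 warm-up requests to reach 99% confidence that both connections are open, and the count grows with the pool size.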

I’m wondering whether a “warm up” API could be brought back to the SDK, one that pre-opens all connections for all channels and all channel “partitions”, because the workaround suggested above does not appear to help much.

0 reactions
bartelink commented, May 17, 2020

@Mortana89 if such a “prime the connections” operation were to exist, you’d definitely want to be able to do it for a single container only. Lots of apps do not use all their containers in a given configuration, and establishing redundant connections wastes resources (not to mention that it triggers contention in such a warmup operation if lots of app instances regularly restart).

@j82w does your tracer bullet operation warm paths to all ranges of a container, or only one? That is: a) does it ensure all database and container metadata retrieval is complete, and b) is the only remaining cacheable activity the establishment of the socket to each of the nodes of the container?
