Performance problems intermittently when using Direct + TCP connection mode
Describe the bug Performance problems occur intermittently when using Direct + TCP connection mode
To Reproduce Create documents at a slow rate, say 1 every 30 seconds
// ConnectionMode is not set here; the .NET SDK v3 defaults to ConnectionMode.Direct (TCP).
_cosmosClient = new CosmosClient(
    connectionString,
    new CosmosClientOptions
    {
        SerializerOptions = new CosmosSerializationOptions
        {
            Indented = true,
            IgnoreNullValues = false,
            PropertyNamingPolicy = CosmosPropertyNamingPolicy.CamelCase
        },
    }
);
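The repro step above (one create roughly every 30 seconds) can be sketched as a timing loop. This is a minimal sketch, not the reporter's actual code: the database name `testdb`, the container name `items`, the `/id` partition key, and the `SlowCreateRepro` class are all hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class SlowCreateRepro
{
    // Hypothetical names: "testdb", "items", and a container partitioned on /id.
    public static async Task RunAsync(CosmosClient client, int iterations = 20)
    {
        Container container = client.GetContainer("testdb", "items");

        for (int i = 0; i < iterations; i++)
        {
            string id = Guid.NewGuid().ToString();
            var doc = new { id, createdAt = DateTime.UtcNow };

            // Time each create from the caller's point of view.
            var stopwatch = Stopwatch.StartNew();
            await container.CreateItemAsync(doc, new PartitionKey(id));
            stopwatch.Stop();

            Console.WriteLine($"create #{i}: {stopwatch.ElapsedMilliseconds} ms");

            // Slow rate from the report: one document every 30 seconds.
            await Task.Delay(TimeSpan.FromSeconds(30));
        }
    }
}
```

Calling `await SlowCreateRepro.RunAsync(_cosmosClient)` with the singleton client should make the slow iterations visible in the console output.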
Expected behavior Once the initial warmup is complete, most create requests should complete within 1 second.
Actual behavior Performance intermittently tanks when a new connection needs to be established, with requests taking up to 3 seconds.
Environment summary SDK Version: 3.20.1 OS Version (e.g. Windows, Linux, MacOSX): Linux & MacOS
Additional context During initial warmup, a connection is being established
DocDBTrace Information: 0 : Opened 1 channels to server rntbd://bluprdapp02-docdb-1.documents.azure.com:14078/
DocDBTrace Information: 0 : Awaiting RNTBD channel initialization. Request URI:
rntbd://bluprdapp02-docdb-1.documents.azure.com:14078/apps/b3102e89-1a41-41aa-b073-4499e437059e/services/ce2b2f6a-f698-449b-8885-400b488915ac/partitions/d95a78a2-6efc-4818-bbc2-f6f8a9993cbb/replicas/132640708438632430s/
DocDBTrace Information: 0 : RNTBD: ConnectUnicastPortAsync connecting to rntbd://bluprdapp02-docdb-1.documents.azure.com:14078/ (address 40.71.204.115)
DocDBTrace Information: 0 : RNTBD connection established 100.64.0.1:49861 -> 40.71.204.115:14078
DocDBTrace Information: 0 : RNTBD SSL handshake complete 100.64.0.1:49861 -> 40.71.204.115:14078
During a subsequent interaction (within 60 seconds), the transaction is interrupted by establishing a connection to a new replica (adding 500+ms overhead)
DocDBTrace Information: 0 : Opened 1 channels to server rntbd://bluprdapp02-docdb-1.documents.azure.com:14000/
DocDBTrace Information: 0 : Awaiting RNTBD channel initialization. Request URI:
rntbd://bluprdapp02-docdb-1.documents.azure.com:14000/apps/b3102e89-1a41-41aa-b073-4499e437059e/services/ce2b2f6a-f698-449b-8885-400b488915ac/partitions/d95a78a2-6efc-4818-bbc2-f6f8a9993cbb/replicas/132666907683438758s/
DocDBTrace Information: 0 : RNTBD: ConnectUnicastPortAsync connecting to rntbd://bluprdapp02-docdb-1.documents.azure.com:14000/ (address 40.71.204.115)
DocDBTrace Information: 0 : RNTBD connection established 100.64.0.1:49910 -> 40.71.204.115:14000
DocDBTrace Information: 0 : RNTBD SSL handshake complete 100.64.0.1:49910 -> 40.71.204.115:14000
Issue Analytics
- State:
- Created 2 years ago
- Comments:14 (6 by maintainers)
Top GitHub Comments
Can you share Diagnostics for the operations where you see high latency? 500ms is rather high for TCP requests if it’s within the same Azure region as the Cosmos DB endpoint.
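One way to collect the requested diagnostics is to log `ItemResponse<T>.Diagnostics` only for slow operations. The sketch below assumes a caller-chosen latency threshold; the `DiagnosticsCapture` class and its method name are hypothetical helpers, not part of the SDK.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class DiagnosticsCapture
{
    // Hypothetical helper: creates an item and prints the SDK diagnostics
    // when the client-side elapsed time exceeds the given threshold.
    public static async Task<ItemResponse<T>> CreateWithDiagnosticsAsync<T>(
        Container container, T item, PartitionKey partitionKey, TimeSpan threshold)
    {
        ItemResponse<T> response = await container.CreateItemAsync(item, partitionKey);

        if (response.Diagnostics.GetClientElapsedTime() > threshold)
        {
            // ToString() returns the diagnostics payload for the operation.
            Console.WriteLine(response.Diagnostics.ToString());
        }

        return response;
    }
}
```

A threshold such as `TimeSpan.FromMilliseconds(500)` would match the overhead described in the report, but the right value depends on the workload.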
Yes to the performance tips: the client is a singleton. I also added a custom telemetry handler to capture the TCP requests via the CustomHandlers functionality.
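The reporter's handler is not shown, so the following is only a guess at its general shape: a `RequestHandler` subclass registered through `CosmosClientOptions.CustomHandlers`. The `TimingHandler` and `ClientFactory` names are hypothetical, and the handler times the whole pipeline beneath it rather than the TCP exchange alone.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Hypothetical handler: logs method, URI, status code and elapsed time per request.
public class TimingHandler : RequestHandler
{
    public override async Task<ResponseMessage> SendAsync(
        RequestMessage request, CancellationToken cancellationToken)
    {
        var stopwatch = Stopwatch.StartNew();
        ResponseMessage response = await base.SendAsync(request, cancellationToken);
        stopwatch.Stop();

        Console.WriteLine(
            $"{request.Method} {request.RequestUri} -> {(int)response.StatusCode} " +
            $"in {stopwatch.ElapsedMilliseconds} ms");

        return response;
    }
}

public static class ClientFactory
{
    // Registers the handler on a client that is meant to be kept as a singleton.
    public static CosmosClient Create(string connectionString)
    {
        var options = new CosmosClientOptions();
        options.CustomHandlers.Add(new TimingHandler());
        return new CosmosClient(connectionString, options);
    }
}
```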