Performance problems intermittently when using Direct + TCP connection mode

Describe the bug: Performance problems occur intermittently when using Direct + TCP connection mode.

To Reproduce: Create documents at a slow rate, say one every 30 seconds, using a client set up like this:

    // ConnectionMode is not set here, so the v3 SDK default (Direct, over TCP) applies.
    _cosmosClient = new CosmosClient(
        connectionString,
        new CosmosClientOptions
        {
            SerializerOptions = new CosmosSerializationOptions
            {
                Indented = true,
                IgnoreNullValues = false,
                PropertyNamingPolicy = CosmosPropertyNamingPolicy.CamelCase
            },
        }
    );

Expected behavior: Once the initial warmup is complete, most create requests should complete within 1 second.

Actual behavior: Performance intermittently tanks when a new connection needs to be established, with latency of up to 3 seconds.

Environment summary:
SDK Version: 3.20.1
OS Version: Linux & macOS

Additional context: During initial warmup, a connection is established:

DocDBTrace Information: 0 : Opened 1 channels to server rntbd://bluprdapp02-docdb-1.documents.azure.com:14078/
DocDBTrace Information: 0 : Awaiting RNTBD channel initialization. Request URI: 
    rntbd://bluprdapp02-docdb-1.documents.azure.com:14078/apps/b3102e89-1a41-41aa-b073-4499e437059e/services/ce2b2f6a-f698-449b-8885-400b488915ac/partitions/d95a78a2-6efc-4818-bbc2-f6f8a9993cbb/replicas/132640708438632430s/
DocDBTrace Information: 0 : RNTBD: ConnectUnicastPortAsync connecting to rntbd://bluprdapp02-docdb-1.documents.azure.com:14078/ (address 40.71.204.115)
DocDBTrace Information: 0 : RNTBD connection established 100.64.0.1:49861 -> 40.71.204.115:14078
DocDBTrace Information: 0 : RNTBD SSL handshake complete 100.64.0.1:49861 -> 40.71.204.115:14078

During a subsequent interaction (within 60 seconds), the operation is interrupted while a connection to a new replica is established, adding 500+ ms of overhead:

DocDBTrace Information: 0 : Opened 1 channels to server rntbd://bluprdapp02-docdb-1.documents.azure.com:14000/
DocDBTrace Information: 0 : Awaiting RNTBD channel initialization. Request URI: 
    rntbd://bluprdapp02-docdb-1.documents.azure.com:14000/apps/b3102e89-1a41-41aa-b073-4499e437059e/services/ce2b2f6a-f698-449b-8885-400b488915ac/partitions/d95a78a2-6efc-4818-bbc2-f6f8a9993cbb/replicas/132666907683438758s/
DocDBTrace Information: 0 : RNTBD: ConnectUnicastPortAsync connecting to rntbd://bluprdapp02-docdb-1.documents.azure.com:14000/ (address 40.71.204.115)
DocDBTrace Information: 0 : RNTBD connection established 100.64.0.1:49910 -> 40.71.204.115:14000
DocDBTrace Information: 0 : RNTBD SSL handshake complete 100.64.0.1:49910 -> 40.71.204.115:14000
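
To spot when these extra connections are opened, the trace output can be scanned for the "RNTBD connection established" lines shown above. The following is a minimal sketch that assumes the trace lines have already been collected as strings; the class and method names (`RntbdTraceParser`, `FindNewConnections`) are hypothetical and not part of the SDK.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of the Cosmos DB SDK): extracts the local and
// remote endpoints from "RNTBD connection established" trace lines.
public static class RntbdTraceParser
{
    // Matches lines such as:
    // "DocDBTrace Information: 0 : RNTBD connection established 100.64.0.1:49861 -> 40.71.204.115:14078"
    private static readonly Regex Established = new Regex(
        @"RNTBD connection established (?<local>\S+) -> (?<remote>\S+)",
        RegexOptions.Compiled);

    public static List<(string Local, string Remote)> FindNewConnections(IEnumerable<string> lines)
    {
        var result = new List<(string Local, string Remote)>();
        foreach (string line in lines)
        {
            Match match = Established.Match(line);
            if (match.Success)
            {
                result.Add((match.Groups["local"].Value, match.Groups["remote"].Value));
            }
        }
        return result;
    }
}
```

Run against the two traces above, this would report one connection to port 14078 and a second to port 14000, which is the pattern described in this issue: the same address, but a different replica endpoint.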

Issue Analytics

Created 2 years ago
Comments: 14 (6 by maintainers)

Top GitHub Comments

ealsur commented, Jul 15, 2021

Can you share Diagnostics for the operations where you see high latency? 500ms is rather high for TCP requests if it’s within the same Azure region as the Cosmos DB endpoint.
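
For anyone unsure how to collect the Diagnostics being requested, here is a minimal sketch using the v3 .NET SDK. It assumes an existing `Container` instance; the wrapper name `CreateWithDiagnosticsAsync` and the 500 ms threshold are illustrative assumptions, not something specified in this thread.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class DiagnosticsCapture
{
    // Assumed threshold for "high latency"; adjust to your own expectations.
    private static readonly TimeSpan LatencyThreshold = TimeSpan.FromMilliseconds(500);

    // Hypothetical wrapper: creates an item and logs the SDK diagnostics
    // only when the client-side elapsed time exceeds the threshold.
    public static async Task<ItemResponse<T>> CreateWithDiagnosticsAsync<T>(
        Container container, T item, PartitionKey partitionKey)
    {
        ItemResponse<T> response = await container.CreateItemAsync(item, partitionKey);

        TimeSpan elapsed = response.Diagnostics.GetClientElapsedTime();
        if (elapsed > LatencyThreshold)
        {
            // CosmosDiagnostics.ToString() produces the diagnostics payload
            // that can be attached to the issue.
            Console.WriteLine(
                $"Slow create ({elapsed.TotalMilliseconds:F0} ms): {response.Diagnostics}");
        }

        return response;
    }
}
```

Failed requests throw a `CosmosException`, which exposes a `Diagnostics` property as well, so a `catch` block can log it the same way.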

Jarlotee commented, Jul 15, 2021

Yes to the performance tips: the client is a singleton. I also added a custom telemetry handler to capture the TCP requests via the CustomHandlers functionality.
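
The handler itself was not posted, so the following is only a sketch of what a telemetry handler registered through `CosmosClientOptions.CustomHandlers` might look like. The names `LatencyTelemetryHandler` and `ClientFactory` are hypothetical. Note that a handler at this level times the whole request as it passes through the pipeline beneath it, not an individual TCP exchange.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Hypothetical handler: logs how long each request spends in the rest of
// the SDK request pipeline.
public class LatencyTelemetryHandler : RequestHandler
{
    public override async Task<ResponseMessage> SendAsync(
        RequestMessage request, CancellationToken cancellationToken)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        ResponseMessage response = await base.SendAsync(request, cancellationToken);
        stopwatch.Stop();

        Console.WriteLine(
            $"{request.Method} {request.RequestUri} -> {(int)response.StatusCode} " +
            $"in {stopwatch.ElapsedMilliseconds} ms");

        return response;
    }
}

public static class ClientFactory
{
    // Registers the handler on a client; keep the resulting client as a singleton.
    public static CosmosClient Create(string connectionString)
    {
        var options = new CosmosClientOptions();
        options.CustomHandlers.Add(new LatencyTelemetryHandler());
        return new CosmosClient(connectionString, options);
    }
}
```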