
CosmosClient stops working with regenerated, hot-reloaded connection strings

See original GitHub issue

Describe the bug

I’m using Azure App Configuration to reload config without restarting my application. For Cosmos DB, I add the primary connection string to App Configuration, and on first boot everything works fine. I can replace the primary connection string with the secondary, dispose the CosmosClient, and recreate it, and it keeps working.

The problem appears as soon as I try to use a newly regenerated connection string. Whenever a CosmosClient is instantiated with the regenerated key, every request fails with an authorization error (the exception quoted further down reports 401 Unauthorized). The same connection string works from a separate application, but in the one that received the change via hot-reload it will not work until the web app is restarted. If we go back to the connection string we just swapped out, it works again.

To Reproduce

  1. Instantiate a singleton CosmosClient with the primary connection string.
  2. Without rebooting the application, regenerate the secondary connection string.
  3. Dispose the first CosmosClient.
  4. Instantiate a new CosmosClient with the regenerated connection string WITHOUT rebooting the application (I use a combination of OptionsMonitor and AppConfiguration).
  5. All subsequent requests will fail.

Expected behavior

This should work without issue, since we’re creating a new CosmosClient whenever the connection string changes. Whether the connection string was regenerated should not matter.

Actual behavior

As soon as we try to use a newly generated connection string to create a new CosmosClient, everything stops working until the application is restarted.

Environment summary

SDK Version: 3.22.0
OS Version: Windows 10

Additional context

Here’s how we’re handling the lifecycle of the CosmosClient: (screenshot in the original issue, not reproduced here)

This CosmosService is registered as singleton.
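
The reporter's code was shared as an image, so it is not available here. As a rough illustration only, a singleton service that recreates its client on configuration reload might look like the sketch below. It assumes the `Microsoft.Azure.Cosmos` and `Microsoft.Extensions.Options` packages; `CosmosOptions` and its `ConnectionString` property are hypothetical names, and the class body is a guess at the pattern described, not the reporter's implementation.

```csharp
#nullable enable
using System;
using System.Threading;
using Microsoft.Azure.Cosmos;
using Microsoft.Extensions.Options;

// Hypothetical options type bound to the hot-reloaded configuration section.
public sealed class CosmosOptions
{
    public string ConnectionString { get; set; } = "";
}

// Hypothetical singleton wrapper: swaps the client whenever the options change.
public sealed class CosmosService : IDisposable
{
    private readonly IDisposable? _subscription;
    private CosmosClient _client;

    public CosmosService(IOptionsMonitor<CosmosOptions> monitor)
    {
        _client = new CosmosClient(monitor.CurrentValue.ConnectionString);

        // OnChange fires after the configuration provider reloads.
        _subscription = monitor.OnChange(options =>
        {
            var fresh = new CosmosClient(options.ConnectionString);
            var old = Interlocked.Exchange(ref _client, fresh);
            // Simplification: requests still in flight on the old client may fail.
            old.Dispose();
        });
    }

    // Callers fetch the current instance on every use instead of caching it.
    public CosmosClient Client => Volatile.Read(ref _client);

    public void Dispose()
    {
        _subscription?.Dispose();
        Volatile.Read(ref _client).Dispose();
    }
}
```

Constructing a `CosmosClient` does not contact the service, which is consistent with the observation below that the failure only surfaces on the first query.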

The weird thing is that the CosmosClient seems to be instantiated fine; we get no error until we run some kind of query through the SDK. The message of the exception that’s thrown is:

Response status code does not indicate success: Unauthorized (401); Substatus: 0; ActivityId: 346ff9e1-f42c-4faa-a35f-703a3a1c7bb7; Reason: (The input authorization token can't serve the request. Please check that the expected payload is built as per the protocol, and check the key being used. Server used the following payload to sign: 'get wed, 27 oct 2021 15:50:40 gmt ' ActivityId: 346ff9e1-f42c-4faa-a35f-703a3a1c7bb7, Microsoft.Azure.Documents.Common/2.14.0, Windows/10.0.14393 cosmos-netstandard-sdk/3.23.1); 

And the stack trace:

Microsoft.Azure.Cosmos.CosmosException : Response status code does not indicate success: Unauthorized (401); Substatus: 0; ActivityId: 346ff9e1-f42c-4faa-a35f-703a3a1c7bb7; Reason: (The input authorization token can't serve the request. Please check that the expected payload is built as per the protocol, and check the key being used. Server used the following payload to sign: 'get


wed, 27 oct 2021 15:50:40 gmt


ActivityId: 346ff9e1-f42c-4faa-a35f-703a3a1c7bb7, Microsoft.Azure.Documents.Common/2.14.0, Windows/10.0.14393 cosmos-netstandard-sdk/3.23.1);
   at Microsoft.Azure.Cosmos.GatewayStoreClient.ParseResponseAsync(HttpResponseMessage responseMessage, JsonSerializerSettings serializerSettings, DocumentServiceRequest request) in C:\src\github\azure-cosmos-dotnet-v3\Microsoft.Azure.Cosmos\src\GatewayStoreClient.cs:line 124
   at Microsoft.Azure.Cosmos.GatewayAccountReader.GetDatabaseAccountAsync(Uri serviceEndpoint) in C:\src\github\azure-cosmos-dotnet-v3\Microsoft.Azure.Cosmos\src\GatewayAccountReader.cs:line 57
   at Microsoft.Azure.Cosmos.Routing.GlobalEndpointManager.GetAccountPropertiesHelper.GetAndUpdateAccountPropertiesAsync(Uri endpoint) in C:\src\github\azure-cosmos-dotnet-v3\Microsoft.Azure.Cosmos\src\Routing\GlobalEndpointManager.cs:line 300
   at Microsoft.Azure.Cosmos.Routing.GlobalEndpointManager.GetAccountPropertiesHelper.GetAccountPropertiesAsync() in C:\src\github\azure-cosmos-dotnet-v3\Microsoft.Azure.Cosmos\src\Routing\GlobalEndpointManager.cs:line 195
   at Microsoft.Azure.Cosmos.GatewayAccountReader.InitializeReaderAsync() in C:\src\github\azure-cosmos-dotnet-v3\Microsoft.Azure.Cosmos\src\GatewayAccountReader.cs:line 83
   at Microsoft.Azure.Cosmos.CosmosAccountServiceConfiguration.InitializeAsync() in C:\src\github\azure-cosmos-dotnet-v3\Microsoft.Azure.Cosmos\src\Resource\Settings\CosmosAccountServiceConfiguration.cs:line 60
   at Microsoft.Azure.Cosmos.DocumentClient.InitializeGatewayConfigurationReaderAsync() in C:\src\github\azure-cosmos-dotnet-v3\Microsoft.Azure.Cosmos\src\DocumentClient.cs:line 6597
   at Microsoft.Azure.Cosmos.DocumentClient.GetInitializationTaskAsync(IStoreClientFactory storeClientFactory) in C:\src\github\azure-cosmos-dotnet-v3\Microsoft.Azure.Cosmos\src\DocumentClient.cs:line 958
   at Microsoft.Azure.Cosmos.TaskHelper.<>c__DisplayClass0_0.<<InlineIfPossibleAsync>b__0>d.MoveNext() in C:\src\github\azure-cosmos-dotnet-v3\Microsoft.Azure.Cosmos\src\TaskHelper.cs:line 30
--- End of stack trace from previous location ---
   at Microsoft.Azure.Documents.BackoffRetryUtility`1.ExecuteRetryAsync(Func`1 callbackMethod, Func`3 callShouldRetry, Func`1 inBackoffAlternateCallbackMethod, TimeSpan minBackoffForInBackoffCallback, CancellationToken cancellationToken, Action`1 preRetryCallback)
   at Microsoft.Azure.Documents.ShouldRetryResult.ThrowIfDoneTrying(ExceptionDispatchInfo capturedException)
   at Microsoft.Azure.Documents.BackoffRetryUtility`1.ExecuteRetryAsync(Func`1 callbackMethod, Func`3 callShouldRetry, Func`1 inBackoffAlternateCallbackMethod, TimeSpan minBackoffForInBackoffCallback, CancellationToken cancellationToken, Action`1 preRetryCallback)
   at Microsoft.Azure.Cosmos.DocumentClient.EnsureValidClientAsync(ITrace trace) in C:\src\github\azure-cosmos-dotnet-v3\Microsoft.Azure.Cosmos\src\DocumentClient.cs:line 1481
--- Cosmos Diagnostics ---{"Summary":{},"name":"Typed FeedIterator ReadNextAsync","id":"a85cd026-7d9f-41b2-9084-83b4811de132","caller info":{"member":"OperationHelperWithRootTraceAsync","file":"ClientContextCore.cs","line":244},"start time":"02:02:36:852","duration in milliseconds":8.869,"data":{"Client Configuration":{"Client Created Time Utc":"2021-10-27T15:50:40.1699497Z","NumberOfClientsCreated":7,"User Agent":"cosmos-netstandard-sdk/3.22.0|3.23.1|7|X86|Microsoft Windows 10.0.14393|.NET 5.0.9|N|","ConnectionConfig":{"gw":"(cps:50, urto:10, p:False, httpf: False)","rntbd":"(cto: 5, icto: 600, mrpc: 30, mcpe: 65535, erd: False, pr: PrivatePortPool)","other":"(ed:False, be:False)"},"ConsistencyConfig":"(consistency: NotSet, prgns:[])"}},"children":[{"name":"Create Query Pipeline","id":"764d347a-3550-4f5c-9533-923fdc3860f2","caller info":{"member":"TryCreateCoreContextAsync","file":"CosmosQueryExecutionContextFactory.cs","line":85},"start time":"02:02:36:852","duration in milliseconds":8.2982,"children":[{"name":"Get Container Properties","id":"fe6057cb-02e6-454a-8cd9-ad43b7afb54f","caller info":{"member":"GetCachedContainerPropertiesAsync","file":"ClientContextCore.cs","line":391},"start time":"02:02:36:852","duration in milliseconds":8.2247,"children":[{"name":"Get Collection Cache","id":"66acd056-cfd8-4267-9497-1788b28703e3","caller info":{"member":"GetCollectionCacheAsync","file":"DocumentClient.cs","line":546},"start time":"02:02:36:852","duration in milliseconds":8.1605,"children":[{"name":"Waiting for Initialization of client to complete","id":"bf575923-99e1-48ba-8330-ae76c27b6fa2","caller info":{"member":"EnsureValidClientAsync","file":"DocumentClient.cs","line":1425},"start time":"02:02:36:852","duration in milliseconds":8.1061}]}]}]},{"name":"POCO Materialization","id":"ca967559-b22f-4ca4-9c58-3946bf213755","caller info":{"member":"ReadNextAsync","file":"FeedIteratorCore.cs","line":247},"start time":"02:02:36:861","duration in milliseconds":0.0491}]}

The way I’m doing this seems to work fine with other services, such as ServiceBus, Redis and SQL Server, so that’s why I opened the issue here.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

2 reactions
ealsur commented, Nov 2, 2021

Yes, key rotations can take a long time; that is why the documentation lists these steps:

  1. Rotate Secondary key, wait and confirm it works. It can take from minutes to hours according to the docs.
  2. Switch app to use Secondary key.
  3. Rotate Primary key, same wait.
  4. Switch app to use new Primary key.
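
The "wait and confirm it works" part of these steps could be automated along the lines of the sketch below. It assumes the `Microsoft.Azure.Cosmos` package; `KeyRotationProbe` and `WaitUntilKeyWorksAsync` are hypothetical names, and `ReadAccountAsync` is used here simply as an inexpensive authenticated call. Each attempt uses a throwaway client, because the thread reports that a client which failed with a freshly rotated key does not recover.

```csharp
using System;
using System.Net;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Hypothetical helper: polls until the rotated key is accepted by the service.
public static class KeyRotationProbe
{
    // Returns true once a client built from the new connection string can read
    // the account; returns false if the timeout elapses first.
    public static async Task<bool> WaitUntilKeyWorksAsync(
        string connectionString,
        TimeSpan timeout,
        TimeSpan pollInterval,
        CancellationToken cancellationToken = default)
    {
        var deadline = DateTime.UtcNow + timeout;
        while (DateTime.UtcNow < deadline)
        {
            // New client per attempt; disposed at the end of each iteration.
            using var probe = new CosmosClient(connectionString);
            try
            {
                await probe.ReadAccountAsync();
                return true;
            }
            catch (CosmosException ex) when (ex.StatusCode == HttpStatusCode.Unauthorized)
            {
                // Key not propagated yet: wait, then try again with a fresh client.
                await Task.Delay(pollInterval, cancellationToken);
            }
        }
        return false;
    }
}
```

An application could call this with the rotated connection string and only switch its long-lived client once the method returns true. The timeout and poll interval are left to the caller, since the thread only says propagation can take from minutes to hours.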

I will try to repro this scenario:

“There’s still a quirk: If the CosmosClient is instantiated with the new connection string without waiting several minutes, it breaks and will not recover; a new object needs to be created for the new connection string to work. No matter how long we wait (we even waited the whole weekend), that CosmosClient instance will never work.”

and see what could be the issue.

0 reactions
jcmartinez23 commented, Nov 2, 2021

Tested again, waiting roughly ten minutes between regenerating and using the new connection string, and it seems to work fine. I expected it to work as soon as the Azure portal confirms that the regeneration has completed successfully, but it seems that is not the case.

There’s still a quirk: If the CosmosClient is instantiated with the new connection string without waiting several minutes, it breaks and will not recover; a new object needs to be created for the new connection string to work. No matter how long we wait (we even waited the whole weekend), that CosmosClient instance will never work.
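
One way to work around this quirk is to replace the client when it reports 401 rather than waiting for it to recover. The following is a minimal sketch under that assumption, using the `Microsoft.Azure.Cosmos` package; `RecreatingCosmosClient` and `ExecuteAsync` are hypothetical names, not part of the SDK, and the single-retry policy is an arbitrary choice.

```csharp
using System;
using System.Net;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Hypothetical wrapper: rebuilds the CosmosClient when a call fails with 401.
public sealed class RecreatingCosmosClient : IDisposable
{
    private readonly Func<string> _connectionStringProvider;
    private CosmosClient _client;

    public RecreatingCosmosClient(Func<string> connectionStringProvider)
    {
        _connectionStringProvider = connectionStringProvider;
        _client = new CosmosClient(connectionStringProvider());
    }

    public async Task<T> ExecuteAsync<T>(Func<CosmosClient, Task<T>> operation)
    {
        var current = Volatile.Read(ref _client);
        try
        {
            return await operation(current);
        }
        catch (CosmosException ex) when (ex.StatusCode == HttpStatusCode.Unauthorized)
        {
            // Build a replacement instead of reusing the failed instance.
            var fresh = new CosmosClient(_connectionStringProvider());
            var previous = Interlocked.CompareExchange(ref _client, fresh, current);
            if (ReferenceEquals(previous, current))
            {
                // We won the swap. Simplification: other callers still using
                // the old instance may see failures after it is disposed.
                current.Dispose();
            }
            else
            {
                // Another caller already swapped in a new client; use theirs.
                fresh.Dispose();
            }

            // Single retry; a second 401 propagates to the caller.
            return await operation(Volatile.Read(ref _client));
        }
    }

    public void Dispose() => Volatile.Read(ref _client).Dispose();
}
```

A call would then be routed through the wrapper, for example `await wrapper.ExecuteAsync(c => c.GetContainer("db", "items").ReadItemAsync<MyItem>(id, new PartitionKey(pk)))`, where the database, container, and item type are placeholders.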

Still, thank you for the support and sorry for the bother. Keep up the good work.

