[QUERY] CredentialUnavailableException handling when using DefaultAzureCredential and AzureSDKs

Library name and version

Azure.Identity 1.4.1

Query/Question

Our infrastructure is deployed to AKS. To connect our managed identity in the current subscription, we assign the ManagedIdentity in our resource group to the AAD-PodIdentity system-assigned managed identity. For the most part this works well. Our service is an EventHubProcessor client that is started using the Microsoft Generic Host framework and deployed to our AKS cluster. The service also communicates with CosmosDB and Azure Storage accounts, the latter for both the Event Hub checkpoint in blob storage and storage queues.

The problem occurs when our application starts up and tries to authenticate to our services using DefaultAzureCredential. It appeared as though we had fixed the issue by always “newing” up the DefaultAzureCredential with the ManagedIdentity Client ID passed in and all other authentication methods turned off. Unfortunately, I discovered today that we started getting CredentialUnavailableExceptions for all of our services with the message “Endpoint not found”. I haven’t pinpointed exactly how this issue keeps appearing. It seems to happen immediately on startup, which is OK because our Istio sidecar proxy hasn’t applied the identity binding yet. However, once that is stood up and our service is running, the credential seems to lose authentication and crashes the application, causing the pod to restart.

The ask here is to figure out what we may need to change, either in our code or in how our infrastructure is set up, to get the proper credentials.

For services such as EventHubProducerClient, Storage, and KeyVault, we use the AzureClientBuilder during registration:

// Look up the application's IIdentityClientFactory and ask it for a credential.
Func<IServiceProvider, TokenCredential> credentialFactory = (services) =>
{
    var identityClientFactory = services.GetRequiredService<IIdentityClientFactory>();
    return identityClientFactory.GetTokenCredential();
};

// Queue client: Base64 message encoding, authenticated through the factory above.
builder
    .AddQueueServiceClient(serviceUri)
    .ConfigureOptions(
        opts =>
        {
            opts.MessageEncoding = QueueMessageEncoding.Base64;
        })
    .WithCredential((services) => credentialFactory.Invoke(services));

// Blob client: same credential factory.
builder
    .AddBlobServiceClient(blobServiceUri)
    .WithCredential((services) => credentialFactory.Invoke(services));

and we pass in a credential factory that only does this:

public TokenCredential GetTokenCredential()
{
    DefaultAzureCredentialOptions azureCredentialOptions = TokenCredentialHelper.CreateCredentialOptions(_config["ManagedIdentityClientId"]);
    return new DefaultAzureCredential(azureCredentialOptions);
}
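
The issue does not show `TokenCredentialHelper.CreateCredentialOptions`. Based only on the description above (managed identity client ID passed in, all other authentication methods turned off), a helper of that kind might look roughly like the sketch below. The class body is a hypothetical reconstruction, not the poster's code; the option properties are the ones exposed by `DefaultAzureCredentialOptions` in Azure.Identity 1.4.x.

```csharp
using Azure.Identity;

public static class TokenCredentialHelper
{
    // Hypothetical reconstruction: the real helper is not shown in the issue.
    public static DefaultAzureCredentialOptions CreateCredentialOptions(string managedIdentityClientId)
    {
        return new DefaultAzureCredentialOptions
        {
            // Select the user-assigned identity to request tokens for.
            ManagedIdentityClientId = managedIdentityClientId,

            // Leave ManagedIdentityCredential enabled and exclude every other
            // credential type in the DefaultAzureCredential chain.
            ExcludeEnvironmentCredential = true,
            ExcludeSharedTokenCacheCredential = true,
            ExcludeVisualStudioCredential = true,
            ExcludeVisualStudioCodeCredential = true,
            ExcludeAzureCliCredential = true,
            ExcludeAzurePowerShellCredential = true,
            ExcludeInteractiveBrowserCredential = true,
        };
    }
}
```

With every other credential excluded, the chain contains only `ManagedIdentityCredential`, so a failure to reach the managed identity endpoint surfaces directly as `CredentialUnavailableException`.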

However, with other services such as the Cosmos client, we just use the credential factory and pass the credential to the CosmosClient constructor, like so:

var credential = _identityClientFactory.GetTokenCredential();
cosmosClient = new CosmosClient(cosmosUri, credential, _cosmosConfig.ClientOptions);

The ask here is: what is the correct way to handle fetching new tokens when a CredentialUnavailableException is thrown? It seems the tokens are cached when these services are instantiated, and most have Scoped lifetimes to help reinitialize per event received. Is there some sort of retry functionality we should include when CredentialUnavailableException is thrown?

Example stacktrace:

Azure.Identity.CredentialUnavailableException: ManagedIdentityCredential authentication unavailable. No Managed Identity endpoint found.
   at Microsoft.Azure.Cosmos.Routing.GlobalEndpointManager.GetAccountPropertiesHelper.GetAccountPropertiesAsync()
   at Microsoft.Azure.Cosmos.GatewayAccountReader.InitializeReaderAsync()
   at Microsoft.Azure.Cosmos.CosmosAccountServiceConfiguration.InitializeAsync()
   at Microsoft.Azure.Cosmos.DocumentClient.InitializeGatewayConfigurationReaderAsync()
   at Microsoft.Azure.Cosmos.DocumentClient.GetInitializationTaskAsync(IStoreClientFactory storeClientFactory)
   at Microsoft.Azure.Cosmos.DocumentClient.EnsureValidClientAsync(ITrace trace)
   at Microsoft.Azure.Cosmos.DocumentClient.GetCollectionCacheAsync(ITrace trace)
   at Microsoft.Azure.Cosmos.ContainerCore.GetCachedContainerPropertiesAsync(Boolean forceRefresh, ITrace trace, CancellationToken cancellationToken)
   at Microsoft.Azure.Cosmos.ContainerCore.GetPartitionKeyDefinitionAsync(CancellationToken cancellationToken)
   at Microsoft.Azure.Cosmos.ContainerCore.ExtractPartitionKeyAndProcessItemStreamAsync[T](Nullable`1 partitionKey, String itemId, T item, OperationType operationType, ItemRequestOptions requestOptions, ITrace trace, CancellationToken cancellationToken)
   at Microsoft.Azure.Cosmos.ContainerCore.CreateItemAsync[T](T item, ITrace trace, Nullable`1 partitionKey, ItemRequestOptions requestOptions, CancellationToken cancellationToken)
   at Microsoft.Azure.Cosmos.ClientContextCore.RunWithDiagnosticsHelperAsync[TResult](ITrace trace, Func`2 task)
   at Microsoft.Azure.Cosmos.ClientContextCore.OperationHelperWithRootTraceAsync[TResult](String operationName, RequestOptions requestOptions, Func`2 task, TraceComponent traceComponent, TraceLevel traceLevel)
   at ResourceScheduler.Data.Infrastructure.Implementations.CosmosDbRepository`1.<>c__DisplayClass16_0.<<CreateAsync>b__0>d.MoveNext() in S:\...\CosmosDbRepository.cs:line 66
--- End of stack trace from previous location ---
   at....Implementations.CosmosDbRepository`1.<CosmosActionWrapper>z__OriginalMethod(String id, Func`1 action)

Environment

.NET 5 Generic Hosting Framework. Deployed to AKS cluster.

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

1 reaction
d-m4m4n commented, Feb 23, 2022

Awesome. Well that solves this issue. Thanks so much for your help!

1 reaction
christothes commented, Feb 22, 2022

In version 1.5.0, the retries are handled automatically by DefaultAzureCredential so you shouldn’t need to retry in your own code. Previously we just tried to make a TCP connection to the endpoint and failed if it did not connect on the first try after less than a second. The new code uses a default Retry policy.
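
For applications that cannot yet upgrade to 1.5.0, one possible stopgap is a thin `TokenCredential` wrapper that retries when `CredentialUnavailableException` is thrown. The sketch below is an illustration, not something taken from the issue or from the Azure SDK: the `RetryingTokenCredential` name, the attempt count, and the fixed delay are all assumptions. It rebuilds the inner credential after each failure, on the unverified assumption that a failed instance might remember that the endpoint was unreachable.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Azure.Core;
using Azure.Identity;

// Hypothetical helper: retries token acquisition while the managed identity
// endpoint is not reachable yet (for example, right after pod startup).
public sealed class RetryingTokenCredential : TokenCredential
{
    private readonly Func<TokenCredential> _factory;
    private readonly int _maxAttempts;
    private readonly TimeSpan _delay;
    private volatile TokenCredential _current;

    public RetryingTokenCredential(Func<TokenCredential> factory, int maxAttempts = 5, TimeSpan? delay = null)
    {
        _factory = factory ?? throw new ArgumentNullException(nameof(factory));
        _maxAttempts = Math.Max(1, maxAttempts);
        _delay = delay ?? TimeSpan.FromSeconds(2); // assumed value, tune per environment
        _current = _factory();
    }

    public override async ValueTask<AccessToken> GetTokenAsync(TokenRequestContext requestContext, CancellationToken cancellationToken)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await _current.GetTokenAsync(requestContext, cancellationToken).ConfigureAwait(false);
            }
            catch (CredentialUnavailableException) when (attempt < _maxAttempts)
            {
                // Wait, then start over with a fresh credential instance.
                await Task.Delay(_delay, cancellationToken).ConfigureAwait(false);
                _current = _factory();
            }
        }
    }

    public override AccessToken GetToken(TokenRequestContext requestContext, CancellationToken cancellationToken)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return _current.GetToken(requestContext, cancellationToken);
            }
            catch (CredentialUnavailableException) when (attempt < _maxAttempts)
            {
                // Blocking wait that still honors cancellation.
                if (cancellationToken.WaitHandle.WaitOne(_delay))
                {
                    cancellationToken.ThrowIfCancellationRequested();
                }
                _current = _factory();
            }
        }
    }
}
```

In the factory from the question, this could be used as `return new RetryingTokenCredential(() => new DefaultAzureCredential(azureCredentialOptions));`. On the final attempt the exception filter no longer matches, so the `CredentialUnavailableException` propagates to the caller as before.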
