JavaScript SDK DefaultAzureCredential stops working under high load
- Package Name: @azure/identity
- Package Version: 1.1.0
- Operating system: Linux
- nodejs
  - version: 12.13.0
- browser
  - name/version:
- typescript
  - version:
- Is the bug related to documentation in
  - README.md
  - source code documentation
  - SDK API docs on https://docs.microsoft.com
Describe the bug
We have a cluster with many pods running Node.js. We use managed identity to access Azure resources, via DefaultAzureCredential from the JavaScript SDK. Under heavy load, some pods eventually can no longer obtain a token: they end up in a zombie state and cannot access any Azure resource.
To Reproduce
Run many pods that use managed identity.
Additional context
We believe the issue is that the ManagedIdentityCredential class caches negative results. If a call times out, the credential never attempts to get a token again. See https://github.com/Azure/azure-sdk-for-js/blob/dcae3ace0872180e0a542a00ca8c8c0b427def42/sdk/identity/identity/src/credentials/managedIdentityCredential.ts
// the latter indicating that we don't yet know whether
// the endpoint is available and need to check for it.
if (this.isEndpointUnavailable !== true) {
  result = await this.authenticateManagedIdentity(
    scopes,
    this.isEndpointUnavailable === null,
    this.clientId,
    newOptions
  );
  // If authenticateManagedIdentity returns null, it means no MSI
  // endpoints are available. In this case, don't try them in future
  // requests.
  this.isEndpointUnavailable = result === null;
} else {
  const error = new CredentialUnavailable(
    "The managed identity endpoint is not currently available"
  );
  logger.getToken.info(formatError(error));
  throw error;
}
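To illustrate the suspected failure mode, here is a minimal, simplified sketch. It is not the SDK's code: the classes, the synchronous `getToken`, and the `ProbeResult` type are all hypothetical, and it assumes (as hypothesized above) that a timed-out probe is treated the same as a missing endpoint. `StickyCredential` mirrors the excerpt's caching logic, while `RetryingCredential` shows one possible alternative that only caches a definitive "no endpoint" answer.

```typescript
// Hypothetical model for illustration; none of these names come from @azure/identity.
type ProbeResult =
  | { kind: "token"; token: string }
  | { kind: "timeout" } // transient: endpoint overloaded or slow
  | { kind: "no-endpoint" }; // definitive: no managed identity endpoint exists

type Probe = () => ProbeResult;

class CredentialUnavailableError extends Error {}

// Mirrors the excerpt: any failed probe permanently disables the credential.
class StickyCredential {
  private isEndpointUnavailable: boolean | null = null;
  private readonly probe: Probe;

  constructor(probe: Probe) {
    this.probe = probe;
  }

  getToken(): string {
    if (this.isEndpointUnavailable === true) {
      throw new CredentialUnavailableError(
        "The managed identity endpoint is not currently available"
      );
    }
    const result = this.probe();
    // A timeout and a missing endpoint are both recorded as "unavailable".
    this.isEndpointUnavailable = result.kind !== "token";
    if (result.kind !== "token") {
      throw new CredentialUnavailableError("No managed identity endpoint found");
    }
    return result.token;
  }
}

// Alternative: cache only the definitive answer, so a timeout can be retried.
class RetryingCredential {
  private isEndpointUnavailable: boolean | null = null;
  private readonly probe: Probe;

  constructor(probe: Probe) {
    this.probe = probe;
  }

  getToken(): string {
    if (this.isEndpointUnavailable === true) {
      throw new CredentialUnavailableError(
        "The managed identity endpoint is not currently available"
      );
    }
    const result = this.probe();
    if (result.kind === "token") {
      this.isEndpointUnavailable = false;
      return result.token;
    }
    if (result.kind === "no-endpoint") {
      this.isEndpointUnavailable = true;
      throw new CredentialUnavailableError("No managed identity endpoint found");
    }
    // Timeout: leave the cached state unchanged so the next call probes again.
    throw new CredentialUnavailableError("Timed out contacting the endpoint");
  }
}

// Replays the given outcomes in order, repeating the last one afterwards.
function scriptedProbe(outcomes: ProbeResult[]): Probe {
  let index = 0;
  return () => outcomes[Math.min(index++, outcomes.length - 1)];
}

// Returns the token, or the error message prefixed with "error: ".
function attempt(credential: { getToken(): string }): string {
  try {
    return credential.getToken();
  } catch (e) {
    return "error: " + (e as Error).message;
  }
}

// One timeout under load, then the endpoint recovers.
const demoOutcomes: ProbeResult[] = [
  { kind: "timeout" },
  { kind: "token", token: "demo-token" },
];
const stickyDemo = new StickyCredential(scriptedProbe(demoOutcomes));
const retryingDemo = new RetryingCredential(scriptedProbe(demoOutcomes));
for (let call = 1; call <= 2; call++) {
  console.log(call, attempt(stickyDemo), "|", attempt(retryingDemo));
}
```

In this model both credentials fail on the first call, but on the second call only `RetryingCredential` returns the token. `StickyCredential` never probes again, which matches the zombie state described above.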
Top GitHub Comments
@balazsmolnar To help us narrow this issue down, we’ll be asking you to test our latest Identity beta version once we release it, most likely today. I’ll follow up with instructions as soon as I’m able to. Thank you for your time!
@sadasant We tested the fix yesterday, and I’m happy to report that we did not experience any managed identity related issues. Thank you for the fix!