[Bug] Large numbers of concurrent token refresh attempts cause a cache refresh convoy resulting in chronic 429 errors

Describe the bug

Consider an application that does the following:

  • Uses WithAppTokenProvider with a callback configured to fetch tokens from a managed identity endpoint
  • Issues hundreds of GetToken requests simultaneously
  • Optionally increases MaxRetries in the HttpClient pipeline used to fetch tokens

When such an application receives a 429 response from the managed identity (MI) endpoint, the result can be a storm of requests and retries that makes the throttling worse. In addition, given how MSAL currently handles retries and cache access, every retry is guaranteed to miss the cache, so requests keep failing until the MI endpoint returns a successful token response.
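To make the convoy concrete, here is a minimal simulation of the behavior described above. It is a sketch under stated assumptions, not MSAL or Azure SDK code: `FakeEndpoint`, `ThrottledError`, `get_token_without_cache`, and `simulate` are hypothetical names, and the endpoint is assumed to throttle a fixed number of calls before it starts succeeding. Python is used only for illustration.

```python
import asyncio


class ThrottledError(Exception):
    """Stands in for an HTTP 429 response (hypothetical)."""


class FakeEndpoint:
    """Hypothetical token endpoint that throttles its first `throttled` calls."""

    def __init__(self, throttled: int) -> None:
        self.throttled = throttled
        self.calls = 0

    async def get_token(self) -> str:
        self.calls += 1
        call_number = self.calls
        await asyncio.sleep(0)  # yield control, as a network round trip would
        if call_number <= self.throttled:
            raise ThrottledError("429 Too Many Requests")
        return "token"


async def get_token_without_cache(endpoint: FakeEndpoint, max_retries: int) -> str:
    """Every attempt, including each retry, goes straight to the endpoint."""
    for _ in range(max_retries + 1):
        try:
            return await endpoint.get_token()
        except ThrottledError:
            continue  # no cache lookup and no backoff before retrying
    raise ThrottledError("retries exhausted")


async def simulate(callers: int, throttled: int, max_retries: int) -> int:
    """Run `callers` concurrent requests and return the number of endpoint calls."""
    endpoint = FakeEndpoint(throttled)
    await asyncio.gather(
        *(get_token_without_cache(endpoint, max_retries) for _ in range(callers))
    )
    return endpoint.calls


if __name__ == "__main__":
    total = asyncio.run(simulate(callers=100, throttled=150, max_retries=3))
    print(f"endpoint calls for one cache entry: {total}")
```

In this model, 100 concurrent callers make 250 endpoint calls (150 throttled plus 100 successful) for a token that one successful request could have supplied to all of them. Raising `max_retries` only raises the ceiling on that traffic.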

Expected behavior

Once the token cache has been refreshed successfully, retry attempts should succeed via a cache hit rather than through a network request to the MI endpoint or authority. For any given cache entry, only one request should be made to the endpoint to refresh the cache, and all other concurrent requests should consume that single result.
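One common way to get this behavior is a "single-flight" refresh, in which concurrent callers share one in-flight request and every retry checks the cache first. The sketch below illustrates that pattern only. It is not how MSAL implements or plans to implement the fix, and `SingleFlightTokenCache`, `ThrottledError`, and `demo` are hypothetical names. It also assumes a single cache entry and omits token expiry and real backoff.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Set, Tuple


class ThrottledError(Exception):
    """Stands in for an HTTP 429 response (hypothetical)."""


class SingleFlightTokenCache:
    """Hypothetical cache: at most one refresh in flight, shared by all callers."""

    def __init__(self, fetch: Callable[[], Awaitable[str]]) -> None:
        self._fetch = fetch
        self._token: Optional[str] = None
        self._inflight: Optional["asyncio.Task[str]"] = None

    async def get_token(self, max_retries: int = 3) -> str:
        for _ in range(max_retries + 1):
            if self._token is not None:   # the cache is consulted on every attempt
                return self._token
            if self._inflight is None:    # the first caller starts the refresh
                self._inflight = asyncio.create_task(self._refresh())
            try:
                return await self._inflight  # everyone else awaits the same task
            except ThrottledError:
                await asyncio.sleep(0)    # placeholder; real code would back off
        raise ThrottledError("retries exhausted")

    async def _refresh(self) -> str:
        try:
            self._token = await self._fetch()
            return self._token
        finally:
            self._inflight = None         # allow a later refresh after a failure


async def demo(callers: int = 100) -> Tuple[int, Set[str]]:
    """Fake fetch that is throttled twice, then succeeds; returns (calls, tokens)."""
    calls = 0

    async def fake_fetch() -> str:
        nonlocal calls
        calls += 1
        call_number = calls
        await asyncio.sleep(0)
        if call_number <= 2:
            raise ThrottledError("429 Too Many Requests")
        return "token"

    cache = SingleFlightTokenCache(fake_fetch)
    tokens = await asyncio.gather(*(cache.get_token() for _ in range(callers)))
    return calls, set(tokens)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With 100 concurrent callers and an endpoint that throttles twice before succeeding, this model makes three endpoint calls in total, and every caller receives the same token. A production version would also need per-entry keys, expiry handling, and protection against one caller's cancellation cancelling the shared task.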

Actual behavior

Once the initial request fails with a retriable status code, no subsequent token request attempts to read the cache; each one results in an additional network request.

Reproduction Steps

Issue a large number of simultaneous GetToken requests with a ManagedIdentityCredential to induce a 429 response from the MI endpoint.

Environment

The customer example was in Service Fabric, but this should reproduce in any managed identity environment in which a 429 response is possible.

Issue Analytics

  • State: open
  • Created 3 months ago
  • Comments: 13 (4 by maintainers)

Top GitHub Comments

damiand2 commented, Aug 16, 2023 (1 reaction)

Updating just the Microsoft.Identity.Client package didn’t help.

damiand2 commented, Aug 7, 2023 (1 reaction)

@bgavrilMS - just to make sure - when you wrote “Can you try to upgrade to use MSAL 5.54 or higher?” - you meant 4.54, right?
