Retries to 429 never succeed under certain configurations

Describe the bug

Under a specific configuration, the Cosmos V3 client never succeeds when retrying a request that received a 429 response, no matter how many retries are attempted.

To Reproduce

All of the following conditions must hold to reproduce the issue:

  • Use V3 Cosmos Client
  • Use ConnectionMode of Direct / TCP (this is the default)
  • Configure Cosmos account to use multiple regions
  • Configure Cosmos account for strong consistency
  • Configure a very generous number of retries, either with Polly via a custom RequestHandler or with the built-in retry configuration: cosmosClientBuilder.WithThrottlingRetryOptions(new TimeSpan(0, 5, 0), 400);
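
The client-side part of this setup could look roughly like the sketch below, assuming the fluent CosmosClientBuilder from the Microsoft.Azure.Cosmos V3 package. The class name ReproClientFactory, the method name CreateClient, and its two parameters are hypothetical and not from the original report. Multi-region replication and strong consistency are account-level settings, so they do not appear in the code.

```csharp
using System;
using Microsoft.Azure.Cosmos;
using Microsoft.Azure.Cosmos.Fluent;

public static class ReproClientFactory
{
    // Hypothetical helper: builds a client with the retry settings quoted in the report.
    // accountEndpoint and authKey are placeholders supplied by the caller.
    public static CosmosClient CreateClient(string accountEndpoint, string authKey)
    {
        return new CosmosClientBuilder(accountEndpoint, authKey)
            // Direct / TCP is already the default; it is stated here only for clarity.
            .WithConnectionModeDirect()
            // Up to 5 minutes of cumulative wait and up to 400 attempts on throttled requests.
            .WithThrottlingRetryOptions(TimeSpan.FromMinutes(5), 400)
            .Build();
    }
}
```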

Next, produce enough load via UpsertItemAsync or ExecuteStoredProcedureAsync to receive 429 responses from Cosmos.

Expected behavior

  • Expect that some requests will receive a 429 and attempt some number of retries before succeeding.
  • Using the RequestHandler, adding metrics to the retries would show that fewer requests took 2 retries vs 1, and fewer took 3 vs 2, etc.

This expected behavior is observed if any of the above configurations are changed.

Actual behavior

  • Any request that receives a 429 will never succeed on subsequent retry attempts, even over the course of many minutes and many retries.
  • Every request that receives a 429 iterates through all of its retry attempts, failing each time, until it runs out of attempts and the overall request ends with an exception.

Environment summary

Microsoft.Azure.Cosmos version: 3.3.2

.NET Core SDK (reflecting any global.json):
  Version: 2.1.504
  Commit: 91e160c7f0

Runtime Environment:
  OS Name: Windows
  OS Version: 10.0.15063

Additional context

  • I only reproduced this with upserts or stored procedure calls. I did not generate enough load with reads to trigger 429s.
  • This works correctly in the v2 client, even when set to Direct / TCP.
  • If I use retries or Polly outside of the Cosmos client, there is no problem. That is, I can wrap UpsertItemAsync with a retry, and those retries behave fine. It’s only a problem for retries within the client (via the built-in policy or via a RequestHandler).
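
The working arrangement in the last bullet, where the retry loop sits outside the Cosmos client, might be sketched as follows. This is only an illustration: OuterRetry and its parameters are hypothetical names, the 100 ms fallback delay is an arbitrary assumption, and the helper is deliberately SDK-agnostic so that it compiles on its own. The trailing comment shows how it could be wired to UpsertItemAsync, assuming CosmosException exposes StatusCode and RetryAfter as in the V3 SDK.

```csharp
using System;
using System.Threading.Tasks;

public static class OuterRetry
{
    // Runs the operation and retries it, up to maxRetries times, whenever it
    // throws an exception that isThrottled classifies as a throttling failure.
    public static async Task<T> ExecuteAsync<T>(
        Func<Task<T>> operation,
        Func<Exception, bool> isThrottled,
        Func<Exception, TimeSpan?> retryAfter,
        int maxRetries)
    {
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception ex) when (attempt < maxRetries && isThrottled(ex))
            {
                // Honor the server-suggested delay if present; otherwise use an assumed 100 ms.
                TimeSpan delay = retryAfter(ex) ?? TimeSpan.FromMilliseconds(100);
                await Task.Delay(delay);
            }
        }
    }
}

// Hypothetical usage with the V3 SDK (container and item are assumed to exist):
// var response = await OuterRetry.ExecuteAsync(
//     () => container.UpsertItemAsync(item),
//     ex => ex is CosmosException ce && (int)ce.StatusCode == 429,
//     ex => (ex as CosmosException)?.RetryAfter,
//     maxRetries: 9);
```

Once the retry budget is exhausted, the exception filter no longer matches and the last exception propagates to the caller.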

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 16 (7 by maintainers)

Top GitHub Comments

2 reactions
alexmartinezm commented, Mar 19, 2020

Hi there, is there any update on this issue? I’m facing the same bug. Thanks in advance.

2 reactions
j82w commented, Nov 1, 2019

Thanks for the repro and the clarification. I’m able to repro it using the code you provided, but I’m having trouble getting it to repro against the current master. It now seems to throw a 408 exception instead. I’m still investigating.
