Retries to 429 never succeed under certain configurations
Describe the bug
Under a specific configuration, the Cosmos V3 client never succeeds on any retry after a 429 response, no matter how many retries are attempted.
To Reproduce
All of the following conditions must hold to reproduce:
- Use V3 Cosmos Client
- Use ConnectionMode of Direct / TCP (this is the default)
- Configure Cosmos account to use multiple regions
- Configure Cosmos account for strong consistency
- Configure a very generous number of retries, with either Polly via a custom RequestHandler, or the built-in retry configuration: cosmosClientBuilder.WithThrottlingRetryOptions(new TimeSpan(0, 5, 0), 400);
Next, produce enough load via UpsertItemAsync or ExecuteStoredProcedureAsync to receive 429 responses from Cosmos.
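The client-side part of this setup can be sketched as follows. This is a minimal sketch rather than the reporter's actual code: the ReproClientFactory class and its parameters are hypothetical names, and the multi-region and strong consistency settings are account-level configuration that is assumed to be in place already.

```csharp
using System;
using Microsoft.Azure.Cosmos;
using Microsoft.Azure.Cosmos.Fluent;

// Hypothetical helper (not from the issue) that builds a client matching
// the client-side conditions listed above.
public static class ReproClientFactory
{
    public static CosmosClient Create(string accountEndpoint, string authKey)
    {
        return new CosmosClientBuilder(accountEndpoint, authKey)
            // Direct / TCP is the default; set explicitly here for clarity.
            .WithConnectionModeDirect()
            // Built-in throttling retries: wait up to 5 minutes, up to 400 attempts.
            .WithThrottlingRetryOptions(TimeSpan.FromMinutes(5), 400)
            .Build();
    }
}
```

The resulting client would then be driven with enough concurrent UpsertItemAsync or ExecuteStoredProcedureAsync calls to exceed the provisioned throughput.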
Expected behavior
- Expect that some requests will receive a 429 and attempt some number of retries before succeeding.
- Using the RequestHandler, adding metrics to the retries would show that fewer requests took 2 retries than 1, fewer took 3 than 2, and so on.
This expected behavior is observed if any of the above configurations are changed.
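One way to collect the retry metrics mentioned above is a custom RequestHandler that retries 429 responses itself and records how many retries each request needed. The sketch below is an assumption about how such a handler might look, not the code used in the issue: RetryMetricsHandler, RetryHistogram, and the fixed delay are all hypothetical. Note that, per this report, retries made from inside the client pipeline like this are exactly the ones that never succeed under the affected configuration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Hypothetical handler (not from the issue): retries 429 responses inside the
// client pipeline and records how many retries each request needed.
public class RetryMetricsHandler : RequestHandler
{
    private readonly int maxRetries;
    private readonly TimeSpan delay;

    // Key: number of retries a request needed. Value: how many requests needed that many.
    public ConcurrentDictionary<int, int> RetryHistogram { get; } =
        new ConcurrentDictionary<int, int>();

    public RetryMetricsHandler(int maxRetries, TimeSpan delay)
    {
        this.maxRetries = maxRetries;
        this.delay = delay;
    }

    public override async Task<ResponseMessage> SendAsync(
        RequestMessage request, CancellationToken cancellationToken)
    {
        int retries = 0;
        ResponseMessage response = await base.SendAsync(request, cancellationToken);

        // 429 is compared as an int because HttpStatusCode.TooManyRequests
        // is not defined on every target framework.
        while ((int)response.StatusCode == 429 && retries < this.maxRetries)
        {
            response.Dispose();
            await Task.Delay(this.delay, cancellationToken);
            retries++;
            response = await base.SendAsync(request, cancellationToken);
        }

        this.RetryHistogram.AddOrUpdate(retries, 1, (_, count) => count + 1);
        return response;
    }
}
```

Such a handler would be registered with cosmosClientBuilder.AddCustomHandlers(new RetryMetricsHandler(...)). In the expected case the histogram counts shrink as the retry count grows; in the failing case every throttled request lands in the maxRetries bucket.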
Actual behavior
- Any request that receives a 429 will never succeed on subsequent retry attempts, even over the course of many minutes and many retries.
- Every request that receives a 429 goes through all of its retry attempts, failing each time, until it runs out of attempts and the overall request ends with an exception.
Environment summary
Microsoft.Azure.Cosmos Version: 3.3.2
.NET Core SDK (reflecting any global.json): Version: 2.1.504 Commit: 91e160c7f0
Runtime Environment: OS Name: Windows OS Version: 10.0.15063
Additional context
- I only reproduced this with either Upserts or calling Stored Procs. I did not get enough load with reads to trigger 429s.
- This works correctly in the v2 client, even when set to Direct / TCP.
- If I use retries or Polly outside of the Cosmos client, there is no problem. That is, I can wrap UpsertItemAsync with a retry, and those retries behave fine. It's only a problem for retries within the client (via the built-in policy or via RequestHandler).
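The outer retry described in the last point can be sketched without Polly as a small generic helper. This is an illustrative sketch only: OuterRetry, RunAsync, and the fixed delay between attempts are hypothetical choices, and the issue does not show the reporter's actual wrapper.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not from the issue): retries an operation from
// outside the Cosmos client, so each attempt is a fresh client call.
public static class OuterRetry
{
    public static async Task<T> RunAsync<T>(
        Func<Task<T>> operation,
        Func<Exception, bool> shouldRetry,
        int maxAttempts,
        TimeSpan delay,
        CancellationToken cancellationToken = default)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            // Once attempts are exhausted the filter is false and the exception propagates.
            catch (Exception ex) when (attempt < maxAttempts && shouldRetry(ex))
            {
                await Task.Delay(delay, cancellationToken);
            }
        }
    }
}

// Usage sketch (assumes a Container named container and an item named item):
// await OuterRetry.RunAsync(
//     () => container.UpsertItemAsync(item),
//     ex => ex is CosmosException ce && (int)ce.StatusCode == 429,
//     maxAttempts: 10,
//     delay: TimeSpan.FromSeconds(1));
```

Because the retry sits outside the client, it does not depend on the client's internal retry pipeline, which is where the failure in this report occurs.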
Issue Analytics
- State:
- Created 4 years ago
- Comments:16 (7 by maintainers)
Top GitHub Comments
Hi there, is there any update on this issue? I’m facing the same bug. Thanks in advance.
Thanks for the repro and the clarification. I’m able to reproduce it using the code you provided, but I’m having trouble reproducing it against the current master. It now seems to throw a 408 exception instead. I’m still investigating.