Adjust throttling for 429 response codes

An HTTP 429 response code is returned when we hit an API's rate limit. Usually, it is just a matter of waiting a bit before sending new requests. The “problem” is that, if the concurrency settings allow more requests than the API permits, we will keep getting 429s.

A solution would be to tune throttling so that it delays requests based on 429 responses. It could be an extension or a middleware, since AutoThrottle seems to be designed specifically for throttling based on latency.
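
A minimal sketch of that idea, not Scrapy's API: the hypothetical `Delay429` class below keeps a delay that doubles on each 429 and shrinks by a fixed step on any other response. The class name, its parameters and the default values are assumptions chosen for illustration.

```python
class Delay429:
    """Hypothetical helper: back off on 429s, recover slowly otherwise."""

    def __init__(self, min_delay=0.0, max_delay=60.0,
                 initial_backoff=1.0, factor=2.0, step=0.5):
        self.min_delay = min_delay              # floor for the delay, in seconds
        self.max_delay = max_delay              # ceiling for the delay, in seconds
        self.initial_backoff = initial_backoff  # delay applied after the first 429
        self.factor = factor                    # multiplier applied on each 429
        self.step = step                        # seconds removed on each non-429
        self.delay = min_delay

    def update(self, status):
        """Adjust and return the delay given the latest response status."""
        if status == 429:
            grown = max(self.initial_backoff, self.delay * self.factor)
            self.delay = min(self.max_delay, grown)
        else:
            self.delay = max(self.min_delay, self.delay - self.step)
        return self.delay
```

An extension or middleware could call `update()` for every response and apply the returned value as the delay before the next request. How that delay is applied inside Scrapy is left out here on purpose.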

It may also be worth considering that some APIs return the waiting time in the response: https://github.com/scrapy/scrapy/issues/3849
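
As a rough illustration, assuming the API reports the waiting time through the standard HTTP `Retry-After` header (some APIs use custom headers instead), the value can be either a number of seconds or an HTTP date. The helper name `retry_after_seconds` is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None):
    """Return the wait in seconds from a Retry-After value, or None if unusable."""
    if value is None:
        return None
    value = value.strip()
    # Form 1: delay in seconds, e.g. "120"
    if value.isdigit():
        return float(value)
    # Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Assumption: treat a date without an offset as UTC
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means there is nothing left to wait for
    return max(0.0, (when - now).total_seconds())
```

When the function returns `None`, the caller would fall back to its own backoff policy.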

Here is a previous PR for this: https://github.com/scrapy/scrapy/pull/3061

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 14 (7 by maintainers)

Top GitHub Comments

caffeinatedMike commented, Mar 25, 2021 (2 reactions)

@Gallaecio I think a per-domain basis would be more fitting for my use case. However, with #5015 close to being approved (I’ve been eagerly following that discussion), I believe that functionality will make it easier to account for status codes when throttling, by subclassing the existing AutoThrottle middleware. Still, I feel this functionality addresses a common dilemma, which calls for an official middleware.

Gallaecio commented, Feb 10, 2022 (0 reactions)

From the specification:

Note that this specification does not define how the origin server identifies the user, nor how it counts requests. For example, an origin server that is limiting request rates can do so based upon counts of requests on a per-resource basis, across the entire server, or even among a set of servers. Likewise, it might identify the user by its authentication credentials, or a stateful cookie.

So a 429 does not necessarily mean “too many requests to this domain”; it may mean “too many requests to this specific endpoint”, “too many requests with this cookie”, and so on. Maybe we could assume the domain scenario by default, since it is probably the most likely one in web crawling, but ideally we should figure out a way to allow enough flexibility to deal with the other scenarios effectively.
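
One way to picture that flexibility, purely as a sketch and not an existing Scrapy feature, is to make the throttling scope a pluggable key function. Every name below (`domain_key`, `endpoint_key`, `ScopedDelays`) is hypothetical; the default groups requests by domain, and a different function can group them by endpoint.

```python
from urllib.parse import urlsplit


def domain_key(url):
    """Default scope: every URL on the same host shares one delay."""
    return urlsplit(url).netloc


def endpoint_key(url):
    """Narrower scope: host plus path, so each endpoint has its own delay."""
    parts = urlsplit(url)
    return f"{parts.netloc}{parts.path}"


class ScopedDelays:
    """Hypothetical registry of delays, keyed by a configurable scope."""

    def __init__(self, key_func=domain_key, backoff=1.0):
        self.key_func = key_func  # maps a URL to its throttling scope
        self.backoff = backoff    # seconds added to the scope on each 429
        self.delays = {}

    def on_429(self, url):
        """Record a 429 for the scope of this URL and return its new delay."""
        key = self.key_func(url)
        self.delays[key] = self.delays.get(key, 0.0) + self.backoff
        return self.delays[key]

    def delay_for(self, url):
        """Return the delay currently applying to this URL."""
        return self.delays.get(self.key_func(url), 0.0)
```

A cookie-based scope would follow the same pattern, with a key function that also receives whatever identifies the session.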

Top Results From Across the Web

429 Too Many Requests - HTTP - MDN Web Docs
The HTTP 429 Too Many Requests response status code indicates the user has sent too many requests in a given amount of time...

What an HTTP Error 429 Means & How to Fix It - HubSpot Blog
Learn what the HTTP error 429 status code means, and how to resolve it to get your site up and running ... Set...

Implementing 429 retries and throttling for API rate-limits - Anvil
The first thing we need to nail down is how to handle the error responses when the API limits are exceeded. If you...

Handle throttling problems, or '429 - Azure Logic Apps
In Azure Logic Apps, your logic app returns an "HTTP 429 Too many requests" error when experiencing throttling, which happens when the ...

Troubleshoot API Gateway "429 Too Many Requests" or "Limit ...
Exceeding the throttling limit or quota returns a "429 Too Many Requests" or "Limit Exceeded" error response. For more information, see How ...