
Feature request: Selective SetRequestInterception


This one is coming from Puppeteer-Sharp.

A scenario would be:

The dev wants to download a certain resource (e.g., a video, a PDF, etc.).

Additionally, the dev only needs to intercept those specific resource types, matching predefined URL patterns.

At the moment, all interception happens at the Request stage, with a hardcoded pattern of “*”, which allows neither response body interception nor selective interception.

What do you all think about adding an optional array of patterns to setRequestInterception? It would be passed through to Fetch.enable. I don’t know whether it would be useful to expose handleAuthRequests as well.
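As a rough illustration of the proposal, the parameters forwarded to Fetch.enable could be assembled as below. The shapes follow the CDP Fetch domain (patterns and handleAuthRequests, with urlPattern, resourceType and requestStage per pattern). The helper name buildFetchEnableParams and its defaults are hypothetical and are not part of Puppeteer or Puppeteer-Sharp.

```typescript
// Shapes mirroring the CDP Fetch domain (Fetch.RequestPattern, Fetch.enable).
type RequestStage = "Request" | "Response";

interface RequestPattern {
  urlPattern?: string; // wildcards: '*' = zero or more chars, '?' = exactly one
  resourceType?: string; // e.g. "Media" or "Document"
  requestStage?: RequestStage;
}

interface FetchEnableParams {
  patterns?: RequestPattern[];
  handleAuthRequests?: boolean;
}

// Hypothetical helper (an assumption for illustration, not an existing API).
// The defaults reproduce the behavior described above: pattern "*" at the Request stage.
function buildFetchEnableParams(
  urlPatterns: string[] = ["*"],
  stage: RequestStage = "Request",
  handleAuthRequests: boolean = false,
): FetchEnableParams {
  return {
    patterns: urlPatterns.map((urlPattern) => ({ urlPattern, requestStage: stage })),
    handleAuthRequests,
  };
}

// Selective interception of two file types at the Response stage.
const fetchParams = buildFetchEnableParams(["*.mov", "*.pdf"], "Response");
```

With a raw CDP session, an object like fetchParams would be the argument of a Fetch.enable command.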

Issue Analytics

State:
Created 4 years ago
Reactions: 1
Comments: 8 (3 by maintainers)

Top GitHub Comments

1 reaction

syscafedevelopment commented, Jul 17, 2019

In order to download stuff (retrieving response body, streamed or full), optimally, for large resources (like this):

@syscafedevelopment Let’s elaborate on this for a bit. First, you can already get the response body with response.buffer() if you wish.

Second, in your protocol walk-through, you discuss how we can intercept at response stage. Indeed, we can, but this discussion is orthogonal to selective interception we discuss here. There’s a separate feature request for response interception - #1191.

The title may be misleading. The final objective would indeed be to allow intercepting responses. That is done by allowing selective interception, since it comes down to the UrlPatterns specified when calling Fetch.enable(). You may call it a confusing design decision at the CDP level, but that’s another story. In any case, it’s not that orthogonal to #1191, and it’s also related to all the “allow downloads” discussions (e.g., the long-awaited #299).

Regarding response.buffer() - imagine using that with multi-GB resources. In the best-case scenario, the response will be loaded and parsed by the browser in order to be displayed; then you will be able to get its buffer, iterating over the bytes again (after the browser has already iterated over them) and loading them into memory again, in one go. After that you’ll probably need to iterate over them yet again to save them to disk or do whatever processing you need. That’s going to be (at least) a RAM killer. In the worst-case scenario, it will simply not work, throwing “Protocol errors” in some cases.

What if the dev never wanted the (full) response data to be transmitted back from the server? What if, based only on the response headers or on some partial response data, a dev would choose to skip that resource? At the response.buffer() stage, it is too late - the response data has already been fully transmitted. You can now only iterate over it again (as given by Chrome, not by the server at that time), if you wish.
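A minimal sketch of such a header-based decision, assuming the request was paused at the Response stage by the CDP Fetch domain. The event fields (requestId, responseStatusCode, responseErrorReason, responseHeaders) and the Fetch.failRequest and Fetch.continueRequest commands come from the protocol; the function name decideOnPausedRequest, the CdpCommand shape and the size limit are illustrative assumptions.

```typescript
interface HeaderEntry {
  name: string;
  value: string;
}

// Subset of the Fetch.requestPaused event payload.
interface PausedRequest {
  requestId: string;
  responseStatusCode?: number;
  responseErrorReason?: string;
  responseHeaders?: HeaderEntry[];
}

// Hypothetical container for the command to send back over the CDP session.
interface CdpCommand {
  method: string;
  params: Record<string, unknown>;
}

// Decide from the headers alone whether to let the response through.
function decideOnPausedRequest(event: PausedRequest, maxBytes: number): CdpCommand {
  // Per the protocol, a paused request is at the response stage when either
  // responseStatusCode or responseErrorReason is present.
  const atResponseStage =
    event.responseStatusCode !== undefined || event.responseErrorReason !== undefined;
  if (atResponseStage) {
    const header = (event.responseHeaders ?? []).find(
      (h) => h.name.toLowerCase() === "content-length",
    );
    const length = header ? Number(header.value) : NaN;
    if (Number.isFinite(length) && length > maxBytes) {
      // Too large for this (arbitrary) limit: abort instead of receiving the body.
      return {
        method: "Fetch.failRequest",
        params: { requestId: event.requestId, errorReason: "Aborted" },
      };
    }
  }
  return { method: "Fetch.continueRequest", params: { requestId: event.requestId } };
}
```

The returned command would then be sent from a Fetch.requestPaused listener on the CDP session. Whether aborting at this point saves the transfer in practice depends on how much of the body the browser has already buffered.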

What we need is a streamed way to do it, without the response data first being handed to the browser, and that requires intercepting at the response stage.

Once the stage is set to Response (aka HeadersReceived), the dev can use the IO.read() method to get the bytes of the response before the browser handles them, save them to disk or process them in any way, and choose whether or not to forward them to the browser.
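The read loop could look roughly like the sketch below. It assumes the stream handle is obtained with Fetch.takeResponseBodyAsStream for a request paused at the Response stage, and then drained with IO.read and released with IO.close. The CdpSession interface is a minimal stand-in for a real session object, and streamResponseBody, ChunkSink and the chunk size are hypothetical names and values.

```typescript
// Minimal stand-in for a CDP session; a real session object would be used instead.
interface CdpSession {
  send(method: string, params?: Record<string, unknown>): Promise<unknown>;
}

// Result shape of IO.read.
interface IoReadResult {
  base64Encoded?: boolean;
  data: string;
  eof: boolean;
}

// Receives each chunk as delivered; decoding (e.g. from base64) is left to the caller.
type ChunkSink = (data: string, base64Encoded: boolean) => void | Promise<void>;

async function streamResponseBody(
  client: CdpSession,
  requestId: string,
  sink: ChunkSink,
  chunkSize: number = 64 * 1024,
): Promise<number> {
  const { stream } = (await client.send("Fetch.takeResponseBodyAsStream", {
    requestId,
  })) as { stream: string };
  let chunks = 0;
  try {
    for (;;) {
      // Sequential read: no offset is passed, only a size hint.
      const chunk = (await client.send("IO.read", {
        handle: stream,
        size: chunkSize,
      })) as IoReadResult;
      if (chunk.data.length > 0) {
        await sink(chunk.data, chunk.base64Encoded === true);
        chunks += 1;
      }
      if (chunk.eof) break;
    }
  } finally {
    // Always release the stream handle, even if the sink throws.
    await client.send("IO.close", { handle: stream });
  }
  return chunks;
}
```

Only one chunk is held in memory at a time, which is the point of the streamed approach. As far as the protocol documentation goes, once the body has been taken as a stream the paused request cannot simply be continued; it has to be failed or fulfilled with a body supplied by the client.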

Moreover, when downloading comes to mind, i.e., when setting the Response stage within the UrlPatterns parameter of Fetch.enable(), a probable direct next step for a dev would be to also filter the URLs down to only the ones that need to be downloaded (e.g., *.mov), since that is done within the same parameter object (patterns). Hence the all-encompassing title of “Selective interception”. But I agree it’s more than just that.
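To try out a pattern such as *.mov before handing it to the browser, the wildcard rules documented for the CDP urlPattern field ('*' matches zero or more characters, '?' matches exactly one, backslash escapes) can be approximated locally. This is only a sketch of those documented rules, not the browser's actual matcher, and urlPatternToRegExp is a hypothetical helper name.

```typescript
// Escape characters that have a special meaning in regular expressions.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Translate a CDP-style wildcard pattern into an anchored regular expression.
function urlPatternToRegExp(pattern: string): RegExp {
  let source = "";
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i);
    if (ch === "\\" && i + 1 < pattern.length) {
      // A backslash makes the next character literal.
      i += 1;
      source += escapeRegExp(pattern.charAt(i));
    } else if (ch === "*") {
      source += ".*";
    } else if (ch === "?") {
      source += ".";
    } else {
      source += escapeRegExp(ch);
    }
  }
  // Anchored on both ends: the pattern is assumed to cover the whole URL.
  return new RegExp("^" + source + "$");
}

const movPattern = urlPatternToRegExp("*.mov");
const matchesPlainUrl = movPattern.test("https://example.com/media/clip.mov");
const matchesWithQuery = movPattern.test("https://example.com/media/clip.mov?token=1");
```

Under the whole-URL assumption, a URL with a query string does not match *.mov, so a pattern like *.mov* may be needed for such URLs.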

0 reactions

stale[bot] commented, Jul 27, 2022

We are closing this issue. If the issue still persists in the latest version of Puppeteer, please reopen the issue and update the description. We will try our best to accommodate it!