Puppeteer/Chrome does not follow/show valid 302 redirects when the final URI is not known
We are using Puppeteer to verify that advertiser links are redirecting to the play store as they are supposed to. Advertisers use a combination of 302 server redirects and front-end JavaScript redirects (with 200 response codes), making Puppeteer a great choice to evaluate final redirect destinations. The issue is that Puppeteer neither follows nor reports the 302 redirects.
Steps to reproduce
- Puppeteer version: 0.13.0
- Platform / OS version: Mac
- URLs (if applicable): http://appclk.me/store.php?page=1
What steps will reproduce the problem?
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // determineDevice(device) returns the correct device from DeviceDescriptors.js
  await page.emulate(determineDevice(device));
  const finalResponse = await page.goto("http://appclk.me/store.php?page=1", {waitUntil: 'networkidle0'});
  // In Puppeteer 0.13.0, status and url are properties of the response
  process.stdout.write(JSON.stringify({
    success: true,
    statusCode: finalResponse.status,
    url: finalResponse.url
  }));
  process.exit(0);
})();
What is the expected result?
The example URL, http://appclk.me/store.php?page=1, returns a 200 status code and uses meta and JavaScript redirects that point to http://appclk.me/store.php?page=2. That second URL redirects with a 302 to a market link: market://details?id=com.kabam.marvelbattle. What SHOULD happen is one of the following:
- The final URL after the network is idle should be http://appclk.me/store.php?page=2 with a status code of 302. We could use the response headers to verify that it redirects to the market link.
- OR the final URL after the network is idle should be market://details?id=com.kabam.marvelbattle (though Chrome doesn’t know how to handle this URL).
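The header check mentioned in the first option could look roughly like the sketch below. The helper names redirectTarget and isMarketLink are hypothetical, and the sketch assumes the response headers are available as a plain object with a location key (lower-case or capitalized).

```javascript
// Hypothetical helpers, not part of Puppeteer.

// Returns the redirect target of a 3xx response, or null when the
// response is not a redirect or carries no Location header.
function redirectTarget(status, headers) {
  const isRedirect = status >= 300 && status < 400;
  const location = headers.location || headers.Location;
  return isRedirect && location ? location : null;
}

// True when the URL uses the market:// scheme from the example above.
function isMarketLink(url) {
  return typeof url === 'string' && /^market:\/\//i.test(url);
}
```

With the status and headers of the page=2 response in hand, isMarketLink(redirectTarget(status, headers)) would then tell us whether the 302 points at the store.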
What happens instead?
The final URL that is printed is http://appclk.me/store.php?page=1 with a status code of 200.
Notes
We are seeing the same behavior in Chrome: open a new tab with the Network dev tools open and visit the first link. You will see that the second redirect is never shown in the Network tab. This is strange behavior, as the Network tab should show all requests and responses. Other browsers like Firefox do show the second request in the Network tab.
I’m aware that this is something with Chrome and not Puppeteer, but I’m posting here in the hopes that someone will point out a flag or some option I’m not aware of that correctly shows ALL network requests so that we can accomplish our goal of verifying redirects properly. Any suggestions are welcome.
Top GitHub Comments
I found the solution, which is to use page.setRequestInterception().
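A minimal sketch of such a setup, not necessarily the exact code used: logRedirects and prop are hypothetical helper names, page is assumed to be a Puppeteer Page, and header names are assumed to be lower-case. The prop helper reads url/status/headers whether they are properties (as in 0.13.0) or methods (as in later versions).

```javascript
// Hypothetical helper: reads a field that may be a property or a method.
function prop(obj, name) {
  const value = obj[name];
  return typeof value === 'function' ? value.call(obj) : value;
}

// Enables request interception and logs every request and response.
// Returns a promise that resolves to the (live) log array once
// interception has been switched on.
function logRedirects(page, log = []) {
  page.on('request', (request) => {
    log.push({ type: 'request', url: prop(request, 'url') });
    request.continue(); // intercepted requests stall unless continued
  });
  page.on('response', (response) => {
    const headers = prop(response, 'headers') || {};
    log.push({
      type: 'response',
      url: prop(response, 'url'),
      status: prop(response, 'status'),
      location: headers.location, // redirect target, if present
    });
  });
  return page.setRequestInterception(true).then(() => log);
}
```

Usage would be along the lines of awaiting logRedirects(page) before calling page.goto(...), then inspecting the returned array for the 302 entry once navigation settles.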
What’s interesting here is that if I comment out await page.setRequestInterception(true); but still leave the request/response event listeners, the second request/response is NOT printed to the console. If setRequestInterception IS used, then THREE requests are logged, with the second correctly displaying a 302, and the third being the “market://” link. Why would requesting to intercept requests change the events that are fired? Is this a bug?

Sometimes the program goes to the requestfailed event rather than the response event, so I can’t handle the 302 status. Do you have any solutions?
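One possible way to cope with the requestfailed case, offered as an untested sketch: listen for both events and record them in the same shape, so a hop that ends in a failure (for example a URL Chrome cannot load) still leaves its URL behind. recordOutcomes is a hypothetical helper name, and page is assumed to be a Puppeteer Page.

```javascript
// Hypothetical helper: collects one record per finished or failed request.
function recordOutcomes(page, outcomes = []) {
  // url/status are properties in Puppeteer 0.13.0 and methods later on.
  const read = (obj, name) =>
    (typeof obj[name] === 'function' ? obj[name]() : obj[name]);

  page.on('response', (response) => {
    outcomes.push({
      url: read(response, 'url'),
      status: read(response, 'status'),
      failed: false,
    });
  });
  page.on('requestfailed', (request) => {
    // No response exists here, so there is no status to report.
    outcomes.push({ url: read(request, 'url'), status: null, failed: true });
  });
  return outcomes;
}
```

The last entry of the returned array then describes where the chain ended, whether or not a response was ever received for it.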