Only get domain redirects and possible bug found

Steps to reproduce

  • Puppeteer version: 0.13.0

What steps will reproduce the problem?

I’m trying to collect all domain redirects with the Puppeteer API and save them to an array before taking a screenshot of the final URL, but my code so far also picks up other redirects.

For example, if I go to youtube.com, my code correctly gets the redirects ‘https://youtube.com/’ and ‘https://www.youtube.com/’, but it also gets other redirects such as doubleclick.net.

I only want to get the redirects which would happen in the URL bar.

I’ve managed to narrow it down with request.resourceType === 'document'. How can I narrow it down further?

Here’s the code:

    // node chrome.js http://youtube.com

    const puppeteer = require('puppeteer');
    var url = process.argv[2];

    (async () => {
      const browser = await puppeteer.launch({headless: true, timeout: 30000, ignoreHTTPSErrors: true});
      const page = await browser.newPage();
      // await page.setRequestInterception(true); // hangs with resourcetype

      const urls = [];

      page.on('request', request => {
        // if (request.resourceType === 'document' || request.resourceType === 'script') {
        if (request.resourceType === 'document') {
          urls.push(request.url);
          request.continue();
        }
      });

      await page.goto(url, {timeout: 20000, waitUntil: 'load'}); // default load

      await page.screenshot({path: 'test.jpg', type: 'jpeg', quality: 80, fullPage: false});
      console.log(urls);

      await browser.close();
    })();

I’ve also found what I believe to be a bug: using await page.setRequestInterception(true); together with request.resourceType === 'document' causes the script to hang until the timeout. The bug is apparent using the above script.
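
A likely explanation for the hang, going by Puppeteer's documented interception behavior: once `page.setRequestInterception(true)` is enabled, every request stalls until it is continued, aborted, or responded to, and the script above only calls `request.continue()` for documents. Below is a minimal sketch of a handler that records document URLs but resumes every request. The helper names `read` and `makeRequestHandler` are hypothetical; `read` is there only because `resourceType` and `url` are plain properties in the 0.13.0 code above but methods in later Puppeteer releases.

```javascript
// Read a value that may be a plain property (as in the 0.13.0 script above)
// or a method (as in later Puppeteer releases).
function read(obj, name) {
  const value = obj[name];
  return typeof value === 'function' ? value.call(obj) : value;
}

// Build a 'request' handler that records document URLs and then resumes
// EVERY request, so nothing is left stalled while interception is enabled.
function makeRequestHandler(urls) {
  return request => {
    if (read(request, 'resourceType') === 'document') {
      urls.push(read(request, 'url'));
    }
    request.continue();
  };
}

// Hypothetical wiring against a real Puppeteer page:
//   const urls = [];
//   await page.setRequestInterception(true);
//   page.on('request', makeRequestHandler(urls));
```

If interception is not needed at all, leaving it off and dropping the `request.continue()` call from the original handler is probably the simpler route.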

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

leem32 commented, Dec 12, 2017 (1 reaction)

@aslushnikov Thanks 😃 That would be incredibly useful to me as I’ve hit a dead-end until I can get all redirects from both client and server side.

I’ve been experimenting with the framenavigated event to get client-side redirects and with the response event to get server-side redirects. However, once a client-side redirect happens, the response event stops following the chain, so it doesn’t register any server-side redirects that come after the client-side redirect.
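
For the server-side half of this, later Puppeteer releases expose `request.redirectChain()`, which lists the requests that were redirected on the way to a response. The sketch below assumes such a release (it is not available in 0.13.0), and the function name `serverRedirectUrls` is hypothetical. It appends the final URL because the chain holds only the redirected requests, not the request that finally succeeded.

```javascript
// Return the URLs of the server-side redirect chain that produced `response`,
// followed by the URL of the final request.
// Assumes a Puppeteer release where response.request().redirectChain() exists.
function serverRedirectUrls(response) {
  const finalRequest = response.request();
  const redirected = finalRequest.redirectChain().map(request => request.url());
  return redirected.concat(finalRequest.url());
}

// Hypothetical usage with a real page:
//   const response = await page.goto(url);
//   console.log(serverRedirectUrls(response));
```

This only covers the HTTP redirects behind a single navigation; redirects triggered by page scripts start a new navigation and would need to be tracked separately.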

Will this thread be updated/closed once the new feature is available?

For anyone interested you can get client-side redirects like this:

    page.on('framenavigated', frame => {
      if (frame.parentFrame() === null) {
        console.log(frame._url);
      }
    });

or

    page.on('framenavigated', frame => {
      if (frame === page.mainFrame()) {
        console.log(frame._url);
      }
    });
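
The same idea can be turned into an array of URL-bar changes rather than log lines. This is a minimal sketch: the function name `trackMainFrameUrls` is hypothetical, and it reads the URL through the public `frame.url()` accessor instead of the private `_url` field used above.

```javascript
// Collect every URL the main frame navigates to, in order.
// Sub-frames (ad iframes, for example) are skipped because they have a parent.
function trackMainFrameUrls(page) {
  const visited = [];
  page.on('framenavigated', frame => {
    if (frame.parentFrame() === null) {
      visited.push(frame.url());
    }
  });
  return visited;
}

// Hypothetical usage with a real page:
//   const visited = trackMainFrameUrls(page);
//   await page.goto(url);
//   console.log(visited);
```

Register the listener before calling `page.goto` so the first navigation is not missed.
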

aslushnikov commented, Dec 14, 2017 (0 reactions)

> Will this thread be updated/closed once the new feature is available?

@leem32 All future updates will take place in #1579; please subscribe to be notified of future changes.
