Empty responses on second page visit in domain with setRequestInterception(true) on some SPA pages

Steps to reproduce

Tell us about your environment:

  • Puppeteer version: 5.3.1
  • Platform / OS version: Windows 10
  • Node.js version: 12.16.3

What steps will reproduce the problem?

Using page.goto() on sites built with the Sapper framework returns a null response on the second page visit within the same domain. The sample code below fetches pages from two different Sapper sites.

const puppeteer = require('puppeteer');

(async() => {
    const browser = await puppeteer.launch()
    const page = await browser.newPage()
    await page.setRequestInterception(true);
    page.on('request', async interceptedRequest => {
        try {
            await interceptedRequest.continue();
        } catch (err) {
            console.log(err);
        }
    });

    let r = null
    console.log('Site 1 first page...')
    r = await page.goto('https://tamethebots.com/', { timeout: 20000, waitUntil: 'networkidle0' }).catch(e => console.error(e));
    console.log(r) // returns as expected
    console.log('Site 1 second page...')
    r = await page.goto('https://tamethebots.com/services', { timeout: 20000, waitUntil: 'networkidle0' }).catch(e => console.error(e));
    console.log(r) // always returns null
    console.log('Site 2 first page...')
    r = await page.goto('https://formvalidation.io/', { timeout: 20000, waitUntil: 'networkidle0' }).catch(e => console.error(e));
    console.log(r) // returns as expected
    console.log('Site 2 second page...')
    r = await page.goto('https://formvalidation.io/guide/validators', { timeout: 20000, waitUntil: 'networkidle0' }).catch(e => console.error(e));
    console.log(r) // always returns null
    await browser.close();
})();

What is the expected result?

The main resource response is returned for all four page.goto() calls.

What happens instead?

As per the comments in the code, the second page.goto() call for each site always resolves to null, and no errors are caught for page.goto(). This only happens when await page.setRequestInterception(true); is set.

I can’t find a reason why this would happen.

EDIT: Looking at the server logs (the first site is mine), there is a request for the second page and its associated resources, and they received a 200 status.
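
Because the reproduction script chains .catch(e => console.error(e)) onto each page.goto() call, a caught error leaves r as undefined, while the behaviour reported here leaves it as null. A minimal sketch of a logging helper that keeps those two cases apart is shown below. The name summarizeResponse is hypothetical and not part of Puppeteer; it only assumes the status() and url() methods of Puppeteer's HTTPResponse.

```javascript
// Hypothetical helper (not part of Puppeteer): turn the value that
// `await page.goto(...).catch(...)` produced into a short log line.
function summarizeResponse(response) {
    if (response === undefined) {
        // The .catch() handler ran, so the navigation itself threw.
        return 'navigation failed (error already logged)';
    }
    if (response === null) {
        // page.goto() resolved without a main resource response.
        return 'navigation resolved with a null response';
    }
    // status() and url() are methods on Puppeteer's HTTPResponse.
    return `${response.status()} ${response.url()}`;
}

// Usage inside the reproduction script (sketch):
//   r = await page.goto(url, { timeout: 20000, waitUntil: 'networkidle0' })
//       .catch(e => console.error(e));
//   console.log(summarizeResponse(r));
```

Logging the summary instead of the raw object makes it easier to see at a glance which of the four navigations came back null.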

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 9

Top GitHub Comments

1 reaction
ntgussoni commented, Dec 27, 2021

await page._client.send('Network.setBypassServiceWorker', { bypass: true });

worked in my case. Thanks!

1 reaction
dwsmart commented, Jan 27, 2021

OK, so this appears to be service worker related. Adding await page._client.send('Network.setBypassServiceWorker', { bypass: true }); solves the issue. Here it is in the context of the example:

const puppeteer = require('puppeteer');

(async() => {
    const browser = await puppeteer.launch()
    const page = await browser.newPage()
    await page._client.send('Network.setBypassServiceWorker', { bypass: true });
    await page.setRequestInterception(true);
    page.on('request', async interceptedRequest => {
        try {
            await interceptedRequest.continue();
        } catch (err) {
            console.log(err);
        }
    });

    let r = null
    console.log('Site 1 first page...')
    r = await page.goto('https://tamethebots.com/', { timeout: 20000, waitUntil: 'networkidle0' }).catch(e => console.error(e));
    console.log(r) // returns as expected
    console.log('Site 1 second page...')
    r = await page.goto('https://tamethebots.com/services', { timeout: 20000, waitUntil: 'networkidle0' }).catch(e => console.error(e));
    console.log(r) // now returns as expected
    console.log('Site 2 first page...')
    r = await page.goto('https://formvalidation.io/', { timeout: 20000, waitUntil: 'networkidle0' }).catch(e => console.error(e));
    console.log(r) // returns as expected
    console.log('Site 2 second page...')
    r = await page.goto('https://formvalidation.io/guide/validators', { timeout: 20000, waitUntil: 'networkidle0' }).catch(e => console.error(e));
    console.log(r) // now returns as expected
    await browser.close();
})();
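
The fix above relies on page._client, which is a private property and may change between Puppeteer versions. A hedged sketch of the same idea using a CDP session created through the public page.target().createCDPSession() call follows. The helper name bypassServiceWorkers is hypothetical, the Network.enable call is included only as a precaution (an assumption, it may not be required), and this variant has not been verified against the two sites in this issue.

```javascript
// Sketch: send Network.setBypassServiceWorker over a CDP session obtained
// from the public API instead of the private page._client property.
// bypassServiceWorkers is a hypothetical helper name, not a Puppeteer API.
async function bypassServiceWorkers(page) {
    // createCDPSession() opens a raw Chrome DevTools Protocol session
    // attached to the page's target.
    const client = await page.target().createCDPSession();
    // Assumption: enabling the Network domain first is a precaution only.
    await client.send('Network.enable');
    await client.send('Network.setBypassServiceWorker', { bypass: true });
    return client;
}

// Usage in the example above (sketch), in place of the page._client line:
//   const page = await browser.newPage();
//   await bypassServiceWorkers(page);
//   await page.setRequestInterception(true);
```

Keeping the call in one helper means only one place needs updating if the private or public API changes again.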