Some websites block Puppeteer access and show a blank ad page

Steps to reproduce

Tell us about your environment:

Puppeteer version: 5.3.0
Platform / OS version: Ubuntu 18.04
URLs (if applicable):
Node.js version: v12.6.0

What steps will reproduce the problem?


var fs = require("fs");

const puppeteer = require('puppeteer-extra')
const StealthPlugin = require('puppeteer-extra-plugin-stealth')
puppeteer.use(StealthPlugin())

const repl = require('puppeteer-extra-plugin-repl')({ addToPuppeteerClass: false })
puppeteer.use(repl)

var sleep = require('sleep');

(async () => {

  const browser = await puppeteer.launch({
    headless: false,
    ignoreHTTPSErrors: true,
    userDataDir: './tmp',
    args: [
        '--lang=en-US,en;q=0.9',
        // '--user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3419.0 Safari/537.36"',
        '--no-sandbox',
        '--disable-setuid-sandbox',
        '--disable-infobars',
        '--window-position=0,0',
        '--ignore-certificate-errors',
        '--ignore-certificate-errors-spki-list',
    ],
  });


  const page = await browser.newPage();

  // Inject the local preload script into every new document before the page's own scripts run.
  const preloadFile = fs.readFileSync('./preload.js', 'utf8');
  await page.evaluateOnNewDocument(preloadFile);

  await page.setViewport({width: 1200, height: 720})

  // Passing 0 disables the default navigation timeout; the call is synchronous.
  page.setDefaultNavigationTimeout(0);
  

  await Promise.all([
    page.goto('https://www.coingecko.com/en/coins/bitcoin', {timeout: 60000}), 
    page.waitForNavigation({ waitUntil: 'networkidle0' }),
  ]);
  await sleep.sleep(5)

  await Promise.all([
    page.goto('https://www.coingecko.com/en/coins/ethereum', {timeout: 60000}), 
    page.waitForNavigation({ waitUntil: 'networkidle0' }),
  ]);
  await sleep.sleep(5)


  await repl.repl(page)
  await sleep.sleep(5)

})();
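
The script above pairs each page.goto call with a separate page.waitForNavigation. As a hedged alternative, the same wait can be requested from page.goto itself through its waitUntil option. The helper name gotoAndWaitForIdle and its return shape are hypothetical, chosen only for this sketch, and page is assumed to be a Puppeteer Page.

```javascript
// Hypothetical helper (not part of Puppeteer): navigate and wait for the
// network to go idle in a single call instead of racing goto against
// a separate waitForNavigation.
async function gotoAndWaitForIdle(page, url, timeoutMs = 60000) {
  // page.goto resolves with the main resource's response (or null)
  // once the `waitUntil` condition has been met.
  const response = await page.goto(url, {
    waitUntil: 'networkidle0',
    timeout: timeoutMs,
  });
  return {
    status: response ? response.status() : null,
    // URL the page reports after the wait; it differs from `url`
    // if a redirect happened before the network went idle.
    finalUrl: page.url(),
  };
}
```

Inside the async function of the script, a call could look like `const result = await gotoAndWaitForIdle(page, 'https://www.coingecko.com/en/coins/bitcoin')`. Comparing result.finalUrl with the requested URL may show whether the redirect had already happened by the time the wait finished.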

What is the expected result?

Normal page behavior, with no automatic redirection.

What happens instead?

Some very popular websites (I tried tweetdeck.twitter.com, reddit.com, and coingecko.com) appear to block access from the Puppeteer-controlled browser. I am not sure that "block" is the right word, so please tell me what is actually happening. Instead of the normal page, they show a blank page with an ad, as in the example screenshot below. More precisely, the normal page is displayed for a few seconds at most, and then the browser is automatically redirected to the blank page. Pressing the back button on the mouse manually returns to the normal page, which then stays visible for the rest of that session. While a page.goto call ends in this automatic redirect to the blank page, Puppeteer is unable to scrape any content from the page.
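
To find out where the automatic redirect leads, one option is to log every main-frame navigation. The sketch below assumes page is a Puppeteer Page and relies only on its framenavigated event, page.mainFrame() and frame.url(). The function name trackMainFrameNavigations and the shape of the log entries are hypothetical.

```javascript
// Hypothetical diagnostic helper (not part of Puppeteer): records each
// navigation of the main frame so an unexpected redirect appears in the log.
function trackMainFrameNavigations(page, log = []) {
  page.on('framenavigated', (frame) => {
    // Ignore iframes (for example ad frames); only the top-level frame matters here.
    if (frame === page.mainFrame()) {
      log.push({ url: frame.url(), at: Date.now() });
    }
  });
  return log;
}
```

Calling `const navigations = trackMainFrameNavigations(page)` right after browser.newPage() and printing navigations once the blank page appears should list the URL of the normal page followed by the URL it was redirected to.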