Distinct options for every new browser instance

Hi!

First of all: very beautiful code and software. I should start learning TypeScript.

Is it possible to pass different options to different browser launches?

As far as I can see from the CONCURRENCY_BROWSER implementation in src/concurrency/built-in/Browser.ts, every new browser is started with identical options:

let chrome = await this.puppeteer.launch(this.options) as puppeteer.Browser;

Would it be possible to pass different options to new launches of browser instances?

I ask because I want to set a different --proxy-server=some-proxy flag for each new browser launch.

Thanks for viewing

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 3
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

10 reactions
NikolaiT commented, Feb 27, 2019

OK, I managed to do this myself.

Here is the test case:

const { Cluster } = require('./dist/index.js');

(async () => {

    let browserArgs = [
        '--disable-infobars',
        '--window-position=0,0',
        '--ignore-certificate-errors',
        '--ignore-certificate-errors-spki-list',
        '--no-sandbox',
        '--disable-setuid-sandbox',
        '--disable-dev-shm-usage',
        '--disable-accelerated-2d-canvas',
        '--disable-gpu',
        '--window-size=1920,1080',
        '--hide-scrollbars',
        '--proxy-server=socks5://78.94.172.42:1080',
    ];

    // each new call to workerInstance() will shift()
    // the first element off this list;
    // maxConcurrency must equal perBrowserOptions.length, otherwise the list is ignored
    let perBrowserOptions = [
        {
            headless: false,
            ignoreHTTPSErrors: true,
            args: browserArgs.concat(['--proxy-server=socks5://78.94.172.42:1080'])
        },
        {
            headless: true,
            ignoreHTTPSErrors: true,
            args: browserArgs.concat(['--proxy-server=socks5://CENSORED'])
        },
    ];

    const cluster = await Cluster.launch({
        monitor: true,
        concurrency: Cluster.CONCURRENCY_BROWSER,
        maxConcurrency: 2,
        puppeteerOptions: {
            headless: false,
            args: browserArgs,
            ignoreHTTPSErrors: true,
        },
        perBrowserOptions: perBrowserOptions
    });

    // Event handler to be called in case of problems
    cluster.on('taskerror', (err, data) => {
        console.log(`Error crawling ${data}: ${err.message}`);
    });


    await cluster.task(async ({ page, data: url }) => {
        await page.goto(url, {waitUntil: 'domcontentloaded', timeout: 20000});
        const pageTitle = await page.evaluate(() => document.title);
        console.log(`Page title of ${url} is ${pageTitle}`);
        console.log(await page.content());
    });

    await cluster.queue('http://ipinfo.io/json');
    await cluster.queue('http://ipinfo.io/json');
    // many more pages

    await cluster.idle();
    await cluster.close();
})();
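
Note that browserArgs in the test case already contains a --proxy-server entry, so each concat() call produces an args list with two such flags. A minimal sketch of a helper that avoids depending on how Chrome resolves the duplicate is shown below. The name buildPerBrowserOptions and the proxy addresses are hypothetical and not part of the library or the patch; the helper only prepares plain option objects.

```javascript
// Hypothetical helper: builds one launch-options object per proxy.
// Any --proxy-server flag already present in the base args is dropped first,
// so each resulting args list carries exactly one proxy flag.
function buildPerBrowserOptions(baseOptions, proxies) {
    const baseArgs = (baseOptions.args || []).filter(
        (arg) => !arg.startsWith('--proxy-server=')
    );
    return proxies.map((proxy) => ({
        ...baseOptions,
        args: [...baseArgs, `--proxy-server=${proxy}`],
    }));
}

// Placeholder proxy addresses, for illustration only.
const perBrowserOptions = buildPerBrowserOptions(
    {
        headless: true,
        ignoreHTTPSErrors: true,
        args: ['--no-sandbox', '--proxy-server=socks5://127.0.0.1:1080'],
    },
    ['socks5://127.0.0.1:9050', 'socks5://127.0.0.1:9051']
);

console.log(perBrowserOptions.map((options) => options.args));
```

The resulting array could be passed as perBrowserOptions in the test case, with maxConcurrency set to its length.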

Here is the diff:

diff --git a/src/Cluster.ts b/src/Cluster.ts
index c2ee9f0..23678f0 100644
--- a/src/Cluster.ts
+++ b/src/Cluster.ts
@@ -20,6 +20,7 @@ interface ClusterOptions {
     maxConcurrency: number;
     workerCreationDelay: number;
     puppeteerOptions: LaunchOptions;
+    perBrowserOptions: any;
     monitor: boolean;
     timeout: number;
     retryLimit: number;
@@ -42,6 +43,7 @@ const DEFAULT_OPTIONS: ClusterOptions = {
     puppeteerOptions: {
         // headless: false, // just for testing...
     },
+    perBrowserOptions: [],
     monitor: false,
     timeout: 30 * 1000,
     retryLimit: 0,
@@ -72,6 +74,8 @@ export default class Cluster extends EventEmitter {
     static CONCURRENCY_BROWSER = 3; // no cookie sharing and individual processes (uses contexts)
 
     private options: ClusterOptions;
+    private perBrowserOptions: any;
+    private usePerBrowserOptions: boolean = false;
     private workers: Worker[] = [];
     private workersAvail: Worker[] = [];
     private workersBusy: Worker[] = [];
@@ -139,7 +143,14 @@ export default class Cluster extends EventEmitter {
         } else if (this.options.concurrency === Cluster.CONCURRENCY_CONTEXT) {
             this.browser = new builtInConcurrency.Context(browserOptions, puppeteer);
         } else if (this.options.concurrency === Cluster.CONCURRENCY_BROWSER) {
+            this.perBrowserOptions = this.options.perBrowserOptions;
+            if (this.perBrowserOptions.length !== this.options.maxConcurrency) {
+                debug('Not enough perBrowserOptions! perBrowserOptions.length must equal maxConcurrency');
+            } else {
+                this.usePerBrowserOptions = true;
+            }
             this.browser = new builtInConcurrency.Browser(browserOptions, puppeteer);
+
         } else if (typeof this.options.concurrency === 'function') {
             this.browser = new this.options.concurrency(browserOptions, puppeteer);
         } else {
@@ -165,12 +176,17 @@ export default class Cluster extends EventEmitter {
         this.nextWorkerId += 1;
         this.lastLaunchedWorkerTime = Date.now();
 
+        let nextBrowserOption: any; // stays undefined (not {}) so Browser.ts falls back to this.options
+        if (this.usePerBrowserOptions && this.perBrowserOptions.length > 0) {
+            nextBrowserOption = this.perBrowserOptions.shift();
+        }
+
         const workerId = this.nextWorkerId;
 
         let workerBrowserInstance: WorkerInstance;
         try {
             workerBrowserInstance = await (this.browser as ConcurrencyImplementation)
-                .workerInstance();
+                .workerInstance(nextBrowserOption);
         } catch (err) {
             throw new Error(`Unable to launch browser for worker, error message: ${err.message}`);
         }
diff --git a/src/concurrency/ConcurrencyImplementation.ts b/src/concurrency/ConcurrencyImplementation.ts
index ce1a1bc..7550467 100644
--- a/src/concurrency/ConcurrencyImplementation.ts
+++ b/src/concurrency/ConcurrencyImplementation.ts
@@ -34,7 +34,7 @@ export default abstract class ConcurrencyImplementation {
     /**
      * Creates a worker and returns it
      */
-    public abstract async workerInstance(): Promise<WorkerInstance>;
+    public abstract async workerInstance(perBrowserOptions: any): Promise<WorkerInstance>;
 
 }
 
diff --git a/src/concurrency/built-in/Browser.ts b/src/concurrency/built-in/Browser.ts
index 9f29753..b3232a6 100644
--- a/src/concurrency/built-in/Browser.ts
+++ b/src/concurrency/built-in/Browser.ts
@@ -11,8 +11,8 @@ export default class Browser extends ConcurrencyImplementation {
     public async init() {}
     public async close() {}
 
-    public async workerInstance(): Promise<WorkerInstance> {
-        let chrome = await this.puppeteer.launch(this.options) as puppeteer.Browser;
+    public async workerInstance(perBrowserOptions: any): Promise<WorkerInstance> {
+        let chrome = await this.puppeteer.launch(perBrowserOptions || this.options) as puppeteer.Browser;
         let page: puppeteer.Page;
         let context: any; // puppeteer typings are old...
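
With this patch, a per-browser entry replaces the shared puppeteerOptions entirely, which is why every entry in the test case repeats headless, ignoreHTTPSErrors and the common arguments. A possible alternative, sketched here as an assumption and not part of the patch, is to overlay each entry on the shared options. The name mergeLaunchOptions and the proxy address are hypothetical.

```javascript
// Hypothetical helper, not part of the patch above: overlay a per-browser
// entry on the shared options, concatenating args instead of replacing them.
function mergeLaunchOptions(sharedOptions, perBrowserOptions) {
    if (!perBrowserOptions) {
        // No per-browser entry: use the shared options unchanged.
        return sharedOptions;
    }
    return {
        ...sharedOptions,
        ...perBrowserOptions,
        args: [...(sharedOptions.args || []), ...(perBrowserOptions.args || [])],
    };
}

const shared = { headless: true, ignoreHTTPSErrors: true, args: ['--no-sandbox'] };
const merged = mergeLaunchOptions(shared, {
    headless: false,
    args: ['--proxy-server=socks5://127.0.0.1:9050'], // placeholder address
});

console.log(merged);
```

With a merge like this, each per-browser entry would only need to list what differs, such as the proxy flag.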
4 reactions
lazybotter commented, Apr 2, 2019

Thanks for the great module!

Is this possible yet? I need to set a different HTTP proxy per browser instance.

Could some kind of event, such as beforeLaunch, be fired so that we can configure each browser/page instance?

The approach above would not work for my application, as I need to dynamically queue tasks to the cluster object every X minutes, and those tasks are fetched from a server.

Thanks
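
The patch earlier in this thread consumes its list with shift(), so each entry is handed out once. For a cluster that keeps receiving tasks over time, one option might be a function that is asked for options on every launch. The sketch below is only an illustration: createOptionsProvider is a hypothetical name, the proxy addresses are placeholders, and nothing in this thread indicates that the library offers such a hook.

```javascript
// Hypothetical provider: hands out launch options in round-robin order,
// so it never runs out no matter how many browsers are started.
function createOptionsProvider(optionsList) {
    if (optionsList.length === 0) {
        throw new Error('optionsList must contain at least one entry');
    }
    let next = 0;
    return () => {
        const options = optionsList[next];
        next = (next + 1) % optionsList.length;
        return options;
    };
}

const nextOptions = createOptionsProvider([
    { args: ['--proxy-server=http://127.0.0.1:8080'] }, // placeholder proxies
    { args: ['--proxy-server=http://127.0.0.1:8081'] },
]);

// Three launches against two entries: the third wraps around to the first.
const picked = [nextOptions(), nextOptions(), nextOptions()];
console.log(picked.map((options) => options.args[0]));
```

A patched workerInstance() caller could call such a provider in place of perBrowserOptions.shift().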
