How to use pool of tabs
This is more of a how-to question than an issue. Let's assume the scenario of generating screenshots of web pages concurrently. My thought was to create a pool of tabs like the one below and use the workers to take the screenshots.
//headless-page-pool
let CDP = require('chrome-remote-interface');
var genericPool:any = require('generic-pool');

const sandBoxFactory = {
  create: async function () {
    try {
      let tabMeta = await CDP.New({ remote : true });
      let client = await CDP({ tab : tabMeta})
      client._target = tabMeta;
      return client;
    } catch (e) {
      console.error(e);
    }
  },
  validate: function(client:any) {
    //TODO: Find a way to validate dev tools connection
  },
  destroy: async function (client:any){
    try {
      return await client.Target.closeTarget(client._target.id);
    } catch (e) {
      console.error(e);
    }
  }
}

var opts = {
  max: 10, // maximum size of the pool
  min: 3 // minimum size of the pool
}

let WorkerPool:any;

function getWorkerPool() {
  return WorkerPool;
}

function createWorkerPool() {
  console.log('creating pool');
  WorkerPool = genericPool.createPool(sandBoxFactory, opts);
  WorkerPool.on('factoryCreateError', function(err:any){
    console.error(err);
  });
  WorkerPool.on('factoryDestroyError', function(err:any){
    console.error(err);
  });
  return WorkerPool
}

function getWorkerFromPool() {
  return WorkerPool.acquire()
}

export { getWorkerPool, createWorkerPool, getWorkerFromPool };
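One possible way to fill in the validate TODO is to send a cheap command to the tab and treat any failure or timeout as an invalid connection. The following is only a sketch: the ProbeClient interface, the validateClient name, and the 2000 ms default are assumptions made for illustration, and Runtime.evaluate is used simply because it is an inexpensive DevTools Protocol command.

```typescript
// Minimal shape of the CDP client that this sketch relies on (assumption).
interface ProbeClient {
  Runtime: { evaluate(params: { expression: string }): Promise<unknown> };
}

// Resolves true if the tab answers a trivial command within timeoutMs,
// and false if the command fails or no answer arrives in time.
function validateClient(client: ProbeClient, timeoutMs = 2000): Promise<boolean> {
  return new Promise<boolean>((resolve) => {
    const timer = setTimeout(() => resolve(false), timeoutMs);
    Promise.resolve()
      // Wrapped in then() so a synchronous throw also counts as a failure.
      .then(() => client.Runtime.evaluate({ expression: "1" }))
      .then(
        () => {
          clearTimeout(timer);
          resolve(true);
        },
        () => {
          clearTimeout(timer);
          resolve(false);
        }
      );
  });
}
```

With generic-pool, a function like this would be supplied as the factory's validate entry, and the pool options would also need testOnBorrow: true, since the pool only validates resources when asked to.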
The pool above creates three tabs in Chrome (the configured minimum). In my request handler:
import {createWorkerPool, getWorkerPool, getWorkerFromPool } from './headless-page-pool';
import fs = require('fs');

createWorkerPool();

export async function captureScreenshot(url:string,
    options:any, timeout=10000):Promise<any> {
  try {
    return new Promise(async function(resolve, reject) {
      let WorkerPool = getWorkerPool();
      let worker = await getWorkerFromPool();
      await worker.Page.enable();
      worker.Page.loadEventFired(async function() {
        console.log('page loaded');
        let result = await worker.Page.captureScreenshot(); //This line never resolves since the target is not active
        resolve(Buffer.from(result.data, 'base64'));
      });
      await worker.Page.navigate({ url });
      WorkerPool.release(worker);
    });
  } catch (error) {
    console.error(error);
  }
}
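Note that the handler above releases the worker as soon as navigate resolves, so another request could acquire the same tab before the screenshot has been taken. A minimal sketch that holds the worker until the capture finishes is shown here. The PageClient and TabPool interfaces and the captureWithWorker name are assumptions introduced for illustration, and the sketch assumes that calling Page.loadEventFired() without a callback returns a promise for the next load event, as chrome-remote-interface documents. It does not by itself address the inactive-tab problem.

```typescript
// Hypothetical minimal shapes; the real client comes from chrome-remote-interface
// and the real pool from generic-pool.
interface PageClient {
  Page: {
    enable(): Promise<unknown>;
    navigate(params: { url: string }): Promise<unknown>;
    loadEventFired(): Promise<unknown>;
    captureScreenshot(): Promise<{ data: string }>;
  };
}

interface TabPool<T> {
  acquire(): Promise<T>;
  release(resource: T): Promise<void>;
}

// Returns the screenshot as a base64 string and always gives the tab back.
async function captureWithWorker(pool: TabPool<PageClient>, url: string): Promise<string> {
  const worker = await pool.acquire();
  try {
    await worker.Page.enable();
    // Subscribe before navigating so the load event cannot be missed.
    const loaded = worker.Page.loadEventFired();
    await worker.Page.navigate({ url });
    await loaded;
    const result = await worker.Page.captureScreenshot();
    return result.data;
  } finally {
    // Release only after the capture has finished (or failed).
    await pool.release(worker);
  }
}
```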
The problem is that await worker.Page.captureScreenshot(); never resolves because the tab is not active. As soon as I click the tab containing the URL in Chrome, it resolves.
This can be worked around by calling
worker.Target.activateTarget({ targetId : worker._target.id})
After doing this, I just get a blank image unless I add a pause, in which case I get the actual image. The bottom line is that, whatever I do, there seems to be no way to process multiple images simultaneously, because only the active tab can capture a screenshot. So how do we use multiple tabs and capture images in all of them concurrently? Is this a bug in Chrome that prevents capturing screenshots when the tab is not active, or am I missing something?
Let me know if I am not clear.
Issue Analytics
- Created 6 years ago
- Reactions:1
- Comments:14 (6 by maintainers)
Top GitHub Comments
It seems that you’re right: the tab must be exposed during the screenshot phase. As you say, activating the target on page load won’t work, probably because another page load event steals the focus before the screenshot is completed, thus producing empty or partial images.
Luckily, though, it’s the page load phase that is usually time-consuming, and that part can be fully parallelized by spawning multiple tabs/targets. Try this: instead of calling
Page.captureScreenshot
as soon as the page is loaded, simply enqueue the task in a common array (or use promises, see below). Then, when all the pages in the batch have finished loading, start the serial activate-screenshot-repeat phase.
Here’s what I mean:
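The snippet from the original comment is not reproduced here. What follows is only a minimal sketch of the approach described, using promises: load all pages in parallel, then activate and capture one tab at a time. The TabClient interface, the targetId field, the Job type, and the loadPage and screenshotBatch names are assumptions introduced for illustration; the Page and Target calls mirror the ones used in the question.

```typescript
// Hypothetical minimal shape of one tab's CDP client plus its target id.
interface TabClient {
  targetId: string;
  Page: {
    enable(): Promise<unknown>;
    navigate(params: { url: string }): Promise<unknown>;
    loadEventFired(): Promise<unknown>;
    captureScreenshot(): Promise<{ data: string }>;
  };
  Target: { activateTarget(params: { targetId: string }): Promise<unknown> };
}

interface Job {
  tab: TabClient;
  url: string;
}

// Phase 1: load one URL in one tab; safe to run for many tabs at once.
async function loadPage(tab: TabClient, url: string): Promise<TabClient> {
  await tab.Page.enable();
  const loaded = tab.Page.loadEventFired(); // subscribe before navigating
  await tab.Page.navigate({ url });
  await loaded;
  return tab;
}

// Loads all pages in parallel, then activates and captures one tab at a time.
async function screenshotBatch(jobs: Job[]): Promise<string[]> {
  const loadedTabs = await Promise.all(jobs.map(({ tab, url }) => loadPage(tab, url)));
  const images: string[] = [];
  for (const tab of loadedTabs) {
    // Phase 2: only the active tab renders, so do this strictly in sequence.
    await tab.Target.activateTarget({ targetId: tab.targetId });
    const { data } = await tab.Page.captureScreenshot();
    images.push(data); // base64-encoded image, in the same order as jobs
  }
  return images;
}
```

If captures still come out blank, the question suggests that a short pause after activation may be needed; that is not included in this sketch.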
Please let me know if this can work for you.
@pthieu I used
client.Target.closeTarget(client._target.id);
I'm not sure if that is the right way, but it does close the browser tab.
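For comparison, the DevTools Protocol documents Target.closeTarget as taking an object with a targetId field, in the same way as the activateTarget call in the question. A minimal sketch under that assumption follows; the ClosableClient interface and the closeTab name are hypothetical, and calling client.close() afterwards is a design choice of this sketch rather than something stated in the thread.

```typescript
// Hypothetical minimal shape of the pooled client used in this thread.
interface ClosableClient {
  _target: { id: string };
  Target: { closeTarget(params: { targetId: string }): Promise<unknown> };
  close(): Promise<void>;
}

// Closes the browser tab, then the client's connection to it.
async function closeTab(client: ClosableClient): Promise<void> {
  try {
    await client.Target.closeTarget({ targetId: client._target.id });
  } finally {
    // Drop the connection even if closing the tab failed.
    await client.close();
  }
}
```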