Separate HAR file and consecutive load error
This whole issue stems from the fact that I can't do a require('chrome-har-capturer').load(large_list); I have found there comes a point where it can't handle the buffer needed to write the HAR output, and I have a list of around 1000 sites I need to scrape into HAR files. Since I can't use one long array, the workaround is to use recursion like this:
var fs = require('fs');
var chc = require('chrome-har-capturer');

var list = ["https://github.com", "https://www.reddit.com"];

function loadSite(i) {
    if (i >= list.length) return; // ends the recursion
    var c = chc.load(list[i]);
    c.on('connect', function () {
        console.log("Connected to Chrome: " + i);
    });
    c.on('end', function (har) {
        console.log("Done: " + i);
        // e.g. fs.writeFileSync('site-' + i + '.har', JSON.stringify(har));
        // loadSite(++i);                                  // uncomment to see "Invalid tab index"
        // setTimeout(function () { loadSite(++i) }, 100); // uncomment to see it NOT error
    });
    c.on('error', function (err) {
        console.error("Cannot connect to Chrome: " + err);
    });
}

// Kicks off the recursion
loadSite(0);
The issue is that Chrome takes something like 50ms after opening a new tab to populate the devtoolsFrontendUrl and webSocketDebuggerUrl properties.
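For what it's worth, here is a minimal sketch of what waiting for that property could look like (this is not chrome-har-capturer code, just an illustration assuming the default localhost:9222 remote-debugging address; the waitForDebuggerUrl name and the 50ms/retry-count values are made up for the example). It polls Chrome's /json endpoint until the tab with the given id reports a webSocketDebuggerUrl:

var http = require('http');

function waitForDebuggerUrl(targetId, retriesLeft, callback) {
    http.get('http://localhost:9222/json', function (res) {
        var body = '';
        res.on('data', function (chunk) { body += chunk; });
        res.on('end', function () {
            // /json returns an array of tab descriptors; find ours
            var tab = JSON.parse(body).filter(function (t) {
                return t.id === targetId;
            })[0];
            if (tab && tab.webSocketDebuggerUrl) {
                callback(null, tab.webSocketDebuggerUrl);
            } else if (retriesLeft > 0) {
                // the property is not populated yet; check again shortly
                setTimeout(function () {
                    waitForDebuggerUrl(targetId, retriesLeft - 1, callback);
                }, 50);
            } else {
                callback(new Error('webSocketDebuggerUrl never appeared for tab ' + targetId));
            }
        });
    }).on('error', callback);
}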
I guess the real question is that there are two ways of dealing with this:
- Most people are probably not going to need to scrape hundreds of sites at once and can just pass an array and split the merged output by page reference, leaving the setTimeout trick as a hack for unusual cases.
  - It would still be nice to have separate HAR files in general: even with just 5 sites you might not want them all merged into one, so maybe build a splitter into the module so people stop writing their own.
- Add a quick (50-100ms) "check again" feature to the webSocketDebuggerUrl lookup, i.e. something that calls again before throwing the error (see the sketch after this list).
  - I notice the published npm module isn't up to date, so the relevant code today is the fetchDebuggingUrl function.
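As a rough illustration of the second option (again just a sketch, not the module's actual API; retryFetch and its parameters are hypothetical, and the real change would live inside fetchDebuggingUrl):

// Hypothetical "check again before throwing" helper.
// attempt() must call back with (err, result); on error we retry a few
// times with a short delay instead of failing on the first missing
// webSocketDebuggerUrl.
function retryFetch(attempt, retries, delayMs, callback) {
    attempt(function (err, result) {
        if (!err) return callback(null, result);
        if (retries <= 0) return callback(err);
        setTimeout(function () {
            retryFetch(attempt, retries - 1, delayMs, callback);
        }, delayMs);
    });
}

// Usage idea: wrap the existing lookup with, say, 3 retries 50ms apart:
// retryFetch(fetchOnce, 3, 50, function (err, url) { ... });

Capping the number of retries keeps the "Cannot connect to Chrome" error meaningful when Chrome really is unreachable, instead of retrying forever.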
I would be happy to make the changes and submit a PR, but I wanted to get your opinion on the subject first.
Issue Analytics
- Created: 6 years ago
- Comments: 7 (3 by maintainers)
Top GitHub Comments
I ran the huge list again and, granted, the resulting file was ~250MB, but it worked. It was probably down to the fact that my old machine was a Raspberry Pi 3, and that file is a solid 1/4 of its RAM right there.
OK, I'm really interested in figuring out why such an error would happen. Thanks for your time!