Separate HAR file and consecutive load error
This whole issue stems from the fact that I can't do a require('chrome-har-capturer').load(large_list); I have found there comes a point where it can't handle the buffer needed to write the HAR output, and I have a list of around 1000 sites I need to scrape into HAR files. Since I can't use one long array, the workaround is to use recursion like this:
var fs = require('fs');
var chc = require('chrome-har-capturer');

var list = ["https://github.com", "https://www.reddit.com"];

function loadSite(i) {
    if (i >= list.length) return; // ends the recursion
    var c = chc.load(list[i]);
    c.on('connect', function () {
        console.log("Connected to Chrome: " + i);
    });
    c.on('end', function (har) {
        console.log("Done: " + i);
        // e.g. fs.writeFileSync('site-' + i + '.har', JSON.stringify(har));
        // loadSite(++i);                                  // uncomment to see "Invalid tab index"
        // setTimeout(function () { loadSite(++i) }, 100); // uncomment to see it NOT error
    });
    c.on('error', function (err) {
        console.error("Cannot connect to Chrome: " + err);
    });
}

// Kicks off the recursion
loadSite(0);
The issue is that Chrome takes something like 50ms after opening a new tab to populate the devtoolsFrontendUrl and webSocketDebuggerUrl properties.
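For what it's worth, here is a minimal sketch of what waiting for that property could look like (this is not chrome-har-capturer code, just an illustration assuming the default localhost:9222 remote-debugging address; the waitForDebuggerUrl name and the 50ms/retry-count values are made up for the example). It polls Chrome's /json endpoint until the tab with the given id reports a webSocketDebuggerUrl:

var http = require('http');

function waitForDebuggerUrl(targetId, retriesLeft, callback) {
    http.get('http://localhost:9222/json', function (res) {
        var body = '';
        res.on('data', function (chunk) { body += chunk; });
        res.on('end', function () {
            // /json returns an array of tab descriptors; find ours
            var tab = JSON.parse(body).filter(function (t) {
                return t.id === targetId;
            })[0];
            if (tab && tab.webSocketDebuggerUrl) {
                callback(null, tab.webSocketDebuggerUrl);
            } else if (retriesLeft > 0) {
                // the property is not populated yet; check again shortly
                setTimeout(function () {
                    waitForDebuggerUrl(targetId, retriesLeft - 1, callback);
                }, 50);
            } else {
                callback(new Error('webSocketDebuggerUrl never appeared for tab ' + targetId));
            }
        });
    }).on('error', callback);
}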
I guess the real question is that there are two ways of dealing with this:
- Most people are probably not going to need to scrape hundreds of sites at once and can just pass an array and split the merged output by page reference, leaving the setTimeout trick as a hack for unusual cases.
  - It would still be nice to have separate HAR files in general: even with just 5 sites you might not want them all merged into one, so maybe build a splitter into the module so people stop writing their own.
- Add a quick (50-100ms) "check again" feature to the webSocketDebuggerUrl lookup, i.e. something that calls again before throwing the error (see the sketch after this list).
  - I notice the published npm module isn't up to date, so the relevant code today is the fetchDebuggingUrl function.
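As a rough illustration of the second option (again just a sketch, not the module's actual API; retryFetch and its parameters are hypothetical, and the real change would live inside fetchDebuggingUrl):

// Hypothetical "check again before throwing" helper.
// attempt() must call back with (err, result); on error we retry a few
// times with a short delay instead of failing on the first missing
// webSocketDebuggerUrl.
function retryFetch(attempt, retries, delayMs, callback) {
    attempt(function (err, result) {
        if (!err) return callback(null, result);
        if (retries <= 0) return callback(err);
        setTimeout(function () {
            retryFetch(attempt, retries - 1, delayMs, callback);
        }, delayMs);
    });
}

// Usage idea: wrap the existing lookup with, say, 3 retries 50ms apart:
// retryFetch(fetchOnce, 3, 50, function (err, url) { ... });

Capping the number of retries keeps the "Cannot connect to Chrome" error meaningful when Chrome really is unreachable, instead of retrying forever.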
I would be happy to make the changes and submit a PR, but I wanted to get your opinion on the subject first.
Issue Analytics
- Created: 6 years ago
- Comments: 7 (3 by maintainers)
Top GitHub Comments
I ran the huge list again and, granted, the resulting file was ~250MB, but it worked. It was probably down to the fact that my old machine was a Raspberry Pi 3, and that file is a solid 1/4 of its RAM right there.
OK, I'm really interested in figuring out why such an error would happen. Thanks for your time!