Average impact

Hey 👋 Thank you again for this insightful project!

I’ve been contemplating the usefulness of the ‘average impact’ metric. As far as I understand from the README, an origin’s impact is not aggregated at the website level. For example, if there are two different scripts from ads.example running on news.example and each of them takes 50ms of total CPU time, the average is calculated as (50 + 50) / 2 requests (= 50ms) and not (50 + 50) / 1 website (= 100ms). Because of that, the current method rewards origins that split their work between multiple scripts, which doesn’t seem fair to me. Grouping execution time per origin (or better yet, TLD or entity) at the website level would significantly change the order of entities in the “Third Parties by Category” section [1].
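To make the difference concrete, here is a minimal Python sketch of the two averaging methods applied to the ads.example case. The record layout `(page, origin, cpu_ms)` and both function names are assumptions made for illustration, not the project's actual data model or code.

```python
from collections import defaultdict

# Hypothetical records: (page, script origin, CPU time in ms).
records = [
    ("news.example", "ads.example", 50.0),
    ("news.example", "ads.example", 50.0),
]

def average_per_script(rows, origin):
    """Mean CPU time over individual script requests."""
    times = [ms for _, o, ms in rows if o == origin]
    return sum(times) / len(times)

def average_per_site(rows, origin):
    """Mean of per-page totals: sum the origin's scripts on each page first."""
    totals = defaultdict(float)
    for page, o, ms in rows:
        if o == origin:
            totals[page] += ms
    return sum(totals.values()) / len(totals)

print(average_per_script(records, "ads.example"))  # 50.0
print(average_per_site(records, "ads.example"))    # 100.0
```

Splitting the same work across more scripts lowers the first number but leaves the second unchanged, which is the unfairness described above.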

WDYT about such a change? Did I miss something in my deliberations? Is such grouping feasible with the current dataset? Or would it make more sense to take a step back and create a new Lighthouse audit that groups total CPU time by entity (perhaps using entity data from this project)?

[1] Average number of scripts per website for selected high-prevalence origins (these numbers come from my own small crawl):

script.hotjar.com - 1 (avg per resource - 19ms, avg per site - 19ms)
code.jquery.com - 1.14
www.googletagservices.com - 2.67
s7.addthis.com - 3.0
platform.twitter.com - 3.2
www.youtube.com - 3.38
t.sharethis.com - 5.05
www.facebook.com - 5.77 (avg per resource - 10.9ms, avg per site - 63ms)
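As a sanity check on the two origins with full data: when both averages are taken over the same set of sites, the per-site average should equal the per-resource average multiplied by the average number of scripts per site. A quick sketch, with illustrative variable names:

```python
# (avg scripts per site, avg ms per resource), taken from the figures above.
rows = {
    "script.hotjar.com": (1.0, 19.0),
    "www.facebook.com": (5.77, 10.9),
}

# Per-site average = scripts per site * ms per resource.
per_site = {origin: round(n * ms, 1) for origin, (n, ms) in rows.items()}
print(per_site)
```

The products come out at about 19ms and 62.9ms, which agrees with the reported per-site averages of 19ms and 63ms.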

Created 4 years ago
Comments: 10 (5 by maintainers)

Top GitHub Comments

2 reactions

rviscomi commented, Apr 29, 2019

So would the aggregation be something like the median sum of execution times per page per entity?
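One way to read "median sum of execution times per page per entity" is: first sum each entity's script times on every page, then take the median of those page totals. A minimal Python sketch under that reading follows; the row layout, entity names, and function name are hypothetical, and the same steps could equally be written in SQL.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows: (page, entity, script execution time in ms).
rows = [
    ("a.example", "Example Social", 10.0),
    ("a.example", "Example Social", 30.0),
    ("b.example", "Example Social", 20.0),
    ("c.example", "Example Social", 5.0),
    ("c.example", "Example Social", 5.0),
    ("a.example", "Example Analytics", 19.0),
]

def median_page_total_by_entity(rows):
    # Step 1: sum execution time per (entity, page).
    totals = defaultdict(float)
    for page, entity, ms in rows:
        totals[(entity, page)] += ms
    # Step 2: collect the page totals belonging to each entity.
    per_entity = defaultdict(list)
    for (entity, _page), total in totals.items():
        per_entity[entity].append(total)
    # Step 3: take the median across pages.
    return {entity: median(values) for entity, values in per_entity.items()}

print(median_page_total_by_entity(rows))
```

With these made-up rows, "Example Social" has page totals of 40, 20 and 10ms, so its median is 20ms regardless of how many scripts produced each total.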

Also FYI I added the map of domains -> entities to a BigQuery table in the HTTP Archive project: https://bigquery.cloud.google.com/table/httparchive:scratchspace.third_parties?pli=1&tab=preview. It’s not kept in sync with this repo but that might be a good way to do everything in SQL.

1 reaction

patrickhulce commented, Apr 29, 2019

Isn’t entity list basically a domain → entity name mapping? Could this be a simple, two column, table that you could join?

That’d get us most of the way, but there’s fancy root domain logic that happens in there too. I suppose we could just generate a dump of all observed origins -> entity name, pre-resolved.
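A rough sketch of what such a pre-resolved dump could look like is below. The entity list, the function names, and the suffix-matching rule are all assumptions; the matching here is a simplified stand-in and not the project's actual root domain logic.

```python
# Hypothetical entity list: entity name -> domains it owns.
ENTITIES = {
    "Example Ads": ["ads.example"],
    "Example Video": ["video.example", "cdn.video.example"],
}

def resolve_entity(hostname, entities=ENTITIES):
    """Return the entity owning hostname, preferring the longest domain suffix."""
    labels = hostname.lower().split(".")
    # Try the full hostname first, then drop leading labels one at a time.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        for name, domains in entities.items():
            if candidate in domains:
                return name
    return None  # No known entity for this hostname.

def dump_mapping(observed_hosts):
    """Pre-resolve every observed hostname into a flat two-column table."""
    return [(host, resolve_entity(host)) for host in sorted(set(observed_hosts))]

for host, entity in dump_mapping(["static.ads.example", "video.example", "unknown.test"]):
    print(host, "->", entity)
```

Because the resolution happens once when the dump is generated, the resulting two-column table can be joined directly without repeating the matching step in the query.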