RFC: Add support for Performance Budgets
Goal: Let's make performance budgets front and center in your developer workflow.
Performance budgets enable shared enthusiasm for keeping a site's user experience within the constraints needed to keep it fast. They usher in a culture of accountability that enables stakeholders to weigh the impact of each change to a site on user-centric metrics.
If you're using Lighthouse locally (via DevTools) or in CI, it should be clear when you're stepping outside your team's agreed-upon perf budgets.
UX mocks we’re evaluating
See the complete set of UX mocks
Setting budgets
A budget can be set for one or more metrics and tells the team that a given metric can't go above (or below) a certain amount.
Referencing work by Tim Kadlec, metrics for perf budgets can include:
- Milestone timings: timings based on the user experience of loading a page (e.g. Time-to-Interactive). You'll often want to pair several milestone timings to accurately represent the complete story during page load.
- Quantity-based metrics: based on raw values (e.g. weight of JavaScript in KB/MB). These are focused on the browser experience.
- Rule-based metrics: scores generated by tools such as Lighthouse or WebPageTest. Often a single number or series used to grade your site.
Teams who incorporate budgets into their workflow will often have CI warn or error a build if a PR regresses performance.
Budgets could vary by whether you're in production or dev, target device class (desktop/mobile/tablet), or network conditions. We should evaluate to what extent developers will need/want to differentiate between these considerations when setting a budget.
Some examples of budgets
- Our home page must load and get interactive in < 5s on slow 3G/Moto G4 (lab)
- Our product page must ship less than 150KB of JavaScript on mobile
- Our search page must include less than 2MB of images on desktop
Budgets can be more or less specific and the thresholds applied can vary based on target networks and devices.
While Lighthouse already provides some default thresholds for metrics like Time-to-Interactive or large JavaScript bundles, a budgeting feature enables teams to adapt budgets to fit their own needs.
Where should budgets be incorporated? (strawman)
Please note this is just a rough sketch for what an integration could look like
Potential touch points for rolling out a performance budgeting solution:
- Chrome DevTools: Support for reading a budget.js/json file for a project. Workspaces could detect the presence of such a file and ask you if you want it applied for the project. Highlight where useful that budgets have been crossed (e.g. Lighthouse, Network panel)
- Lighthouse: We have many options here, some of which are mocked out lower down. One is setting a budget via a config file (per above). Your Lighthouse report highlights any time metrics/resource sizes cross that budget.
- Lighthouse CI: Reads a provided budgets.json/budgets.js and fails the build when supplied budgets are crossed (e.g. TTI < 5s, total JS < 170KB). Work towards this being easy to adopt for use with Travis/GitHub projects.
- Third-party tooling?: Encourage adoption of budget.js/json in popular tools (webpack, SpeedCurve, Calibre, framework tooling, CMS)
- Potentially in RUM dashboards, although we might want to consider what this means… e.g. TTI vs FID budgets
Defining a performance budget
Ultimately, we should leave this up to teams but give them some strong defaults.
Walking back from Alex Russell’s “Can You Afford It?: Real-world performance budgets”, this may be:
- Time-To-Interactive < 5s on Slow 3G on a Moto G4 (~4-5x CPU throttling)
- JavaScript budget of < 170KB if targeting mobile.
- Budgets for other resources can be drawn from a total page-weight target. If a page cannot be larger than 200KB, your budgets for images, JS, CSS, etc. will need to fit within it (see the sketch below).
Budget targets and the resource constraints that drop out of them will depend heavily on who your end users are. If you're attempting to become interactive quickly on a low- to mid-end device, you can't ship 5MB of JavaScript.
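To make the arithmetic concrete, here is a minimal sketch (with placeholder figures, not recommendations) of carving a total page-weight target into per-resource budgets:

```js
// Illustrative only: split a total page-weight target into per-resource budgets.
// The figures below are placeholders, not recommendations.
const TOTAL_PAGE_WEIGHT_KB = 300;

const resourceBudgetsKb = {
  javascript: 170, // per the "Can You Afford It?" guidance for mobile
  css: 30,
  images: 80,
  fonts: 20,
};

const allocated = Object.values(resourceBudgetsKb).reduce((sum, kb) => sum + kb, 0);
if (allocated > TOTAL_PAGE_WEIGHT_KB) {
  throw new Error(
    `Per-resource budgets (${allocated}KB) exceed the page-weight target (${TOTAL_PAGE_WEIGHT_KB}KB)`
  );
}
```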
budgets.js/budgets.json
If we were to opt for a budgets.js/json style configuration file, this could look something like the following (strawman):
```js
module.exports = [
  {
    "preset": "mobile-slow-3g",
    "metrics": {
      "time-to-interactive": {
        "warn": ">5000",
        "error": ">6000"
      },
      "first-contentful-paint": {
        "warn": ">2000",
        "error": ">3000"
      }
    },
    "sizes": {
      "javascript": {
        "warn": ">170",
        "error": ">300"
      },
      "images": {
        "warn": ">500",
        "error": ">600"
      }
    }
  },
  {
    "preset": "desktop-wifi",
    "metrics": {
      "time-to-interactive": {
        "warn": ">3000",
        "error": ">4000"
      },
      "first-contentful-paint": {
        "warn": ">1000",
        "error": ">2000"
      }
    },
    "sizes": {
      "javascript": {
        "warn": ">700",
        "error": ">800"
      },
      "images": {
        "warn": ">1200",
        "error": ">1800"
      }
    }
  }
];
```
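To illustrate how a CI step might consume a file like this, here is a rough sketch. It assumes the build has a Lighthouse JSON report on disk, that the strawman metric keys map onto Lighthouse audit IDs exposing a numericValue, and that only the error thresholds of the mobile preset are checked; none of these names are a settled API, and the size budgets are omitted for brevity.

```js
// Sketch only: fail a CI build when the mobile preset's error budgets are crossed.
const budgets = require('./budgets');
const lhr = require('./lighthouse-report.json'); // e.g. output of `lighthouse --output=json`

// Assumed mapping from strawman budget keys to Lighthouse audit IDs.
const AUDIT_IDS = {
  'time-to-interactive': 'interactive',
  'first-contentful-paint': 'first-contentful-paint',
};

// Rules like ">6000" mean "flag once the value exceeds 6000".
const exceeds = (value, rule) => value > Number(rule.slice(1));

const mobile = budgets.find((b) => b.preset === 'mobile-slow-3g');
const failures = [];

for (const [metric, rules] of Object.entries(mobile.metrics)) {
  const audit = lhr.audits[AUDIT_IDS[metric]];
  if (!audit || typeof audit.numericValue !== 'number') continue;
  if (exceeds(audit.numericValue, rules.error)) {
    failures.push(`${metric}: ${Math.round(audit.numericValue)}ms is over budget (${rules.error})`);
  }
}

if (failures.length > 0) {
  console.error(failures.join('\n'));
  process.exit(1); // fail the build
}
```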
However, we should be careful not to tie ourselves too tightly to this as variance/thresholds may change how we think about configuration. Paul and Patrick have done some great research about presets and we could lean more into that as needed.
Metrics threshold considerations
Lighthouse runs may vary, especially across different machines.
As we explore performance budgeting, we should consider what impact this may have on how users set the thresholds they use for budgeting. Some options that have come up in conversations with projects like lighthouse-thresholds:
Option 1: Multiple LH runs (e.g. "runs": 3 in config)
Pros
- Easy to implement.
- Easy to understand what’s occurring for consumers.
- Could calculate the median absolute deviation, which accounts for outliers (see the sketch after this list).
Cons
- Very small (non-significant) sample size.
- More likely to be viewed as failing on network issues rather than site-specific issues (i.e. the "I'll retry this build until it works" solution).
- Multiple runs on the same machine will still suffer from the same network problems.
- Costs CPU and time in CI environments (teams have fought hard for low build times; it would be a shame to let them increase significantly).
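As a sketch of how even a small number of runs could be summarised robustly, the median absolute deviation mentioned above can be computed like this (values are made up):

```js
// Median and median absolute deviation (MAD) across a few Lighthouse runs;
// the MAD is far less sensitive to a single outlier than the mean.
const median = (values) => {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
};

const ttiRunsMs = [5120, 4870, 5430]; // e.g. "runs": 3
const med = median(ttiRunsMs);
const mad = median(ttiRunsMs.map((v) => Math.abs(v - med)));

console.log({ median: med, mad }); // compare median (plus some multiple of mad) to the budget
```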
Option 2: Define thresholds in ranges
e.g. "first-contentful-paint": { "threshold": 1000, "deviation": "15%" } (a small check sketch follows the pros/cons below)
Pros
- Results could be compared with user-defined deviation allowances.
- Results could fail if a result has deviated from the budget by more than x%.
Cons
- Consumers are unlikely to want to set both an upper and lower bound (e.g. why would it matter if TTI was really low?).
- A range definition could lead to a perception of unreliability.
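A minimal sketch of the Option 2 shape, assuming the strawman field names above and checking only the upper bound:

```js
// Pass as long as the measured value does not exceed the threshold by more
// than the allowed deviation. Field names follow the strawman above.
function withinBudget(value, { threshold, deviation }) {
  const allowance = threshold * (parseFloat(deviation) / 100);
  return value <= threshold + allowance;
}

withinBudget(1100, { threshold: 1000, deviation: '15%' }); // true  (limit is 1150)
withinBudget(1200, { threshold: 1000, deviation: '15%' }); // false (over 1150)
```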
Option 3: Generate measurement data and commit that to code
e.g. a user runs lighthouse-thresholds measure, which triggers 10 or more LH runs performed locally, with the results/data saved to a file. That file could then be committed and used as a baseline to compare CI/PR runs against (see the comparison sketch after the pros/cons). If budgets are updated, so is the generated file.
Pros
- Allows for a large sample size of measurements.
- More accurate calculation of Mean Absolute Deviation
- Opportunity to calculate and use standard deviations.
- Site is measured and budgeted against real data.
- Allows for more complicated statistical analysis like a Mean-Variance Analysis
- This file would be a lot like test snapshot files, so devs are already used to seeing things like this.
- In the absence of this file it could just revert to doing a regular threshold check.
Cons
- Would take more time to implement.
- More complicated to communicate.
- Requires an accurate production run on a local machine (or elsewhere) - though could just be run against actual production locally.
- Another file to maintain and keep in source code along with the budget file.
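To make the comparison step concrete, here is a sketch under the assumption that the committed measurements file stores a median and MAD per metric; the file format and the 3× MAD cut-off are illustrative, not a proposal.

```js
// Flag a single CI run only if it falls well outside the committed distribution.
const baseline = require('./lighthouse-measurements.json');
// e.g. { "time-to-interactive": { "median": 4950, "mad": 180 } }

function regressed(metric, ciValueMs, madMultiplier = 3) {
  const { median, mad } = baseline[metric];
  return ciValueMs > median + madMultiplier * mad;
}

regressed('time-to-interactive', 5200); // false: within the expected spread
regressed('time-to-interactive', 6000); // true: likely a real regression
```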
Option 4: Distributed runs, e.g. LH is run on multiple machines, multiple times
Pros
- Much more accurate performance representation.
- Network issues are eliminated (or at least reduced).
- Much more likely to be seen as reliable
Cons
- Difficult to implement
- Lots of http requests for a distributed solution = slow build?
- Resource intensive, possibly expensive
- CI machines have gigabit networks, so performance is likely a lot better than the real world. Does it matter?
Option 5: Be smarter about how the metrics are computed, e.g. run LH a small number of times but be smarter about how that data is measured against thresholds (sketched below).
Pros
- Can more accurately filter outliers.
- Can more accurately create a measure of Statistical Dispersion
Cons
- More difficult to communicate to consumers.
- Still such a small sample size.
- Statistics aren’t magic.
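One way Option 5 could look in practice, as a sketch: drop obvious outliers from the handful of runs before comparing against the budget. The 2× MAD cut-off here is arbitrary and only for illustration.

```js
const median = (values) => {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
};

// Keep only runs within 2× the MAD of the median, then summarise what's left.
function robustValue(runs) {
  const med = median(runs);
  const mad = median(runs.map((v) => Math.abs(v - med)));
  const filtered = runs.filter((v) => mad === 0 || Math.abs(v - med) <= 2 * mad);
  return median(filtered);
}

robustValue([5100, 5200, 9800]); // 5150 (the 9800ms outlier is ignored)
```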
How is this beneficial to Lighthouse?
This would allow Lighthouse to reduce the friction for developer adoption of performance budgets, helping more sites hit a decent Lighthouse performance score.
Are you willing to work on this yourself?
Yep. I’m happy to own overall technical direction for budgeting and have asked @khempenius (implementation), @developit (framework integrations) and @housseindjirdeh (advisory) to work on this too.
Existing solutions supporting performance budgeting
- SpeedCurve’s perf budgeting and alerts
- Calibre supports performance budgeting
- Performance budget builder by Brad Frost
- Lighthouse CI with target scores
- Lighthouse Thresholds - (pass or fail based on thresholds for audit categories)
- pwmetrics by Paul Irish
- Webpack performance budgets feature (see the config sketch after this list)
- bundlesize for tracking JavaScript bundle sizes
- PerformanceBudget.io
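For reference, webpack's existing performance-budget feature (listed above) is configured along these lines in webpack.config.js; sizes are in bytes:

```js
module.exports = {
  performance: {
    hints: 'error',            // or 'warning', or false to disable
    maxEntrypointSize: 170000, // ~170KB of assets per entry point
    maxAssetSize: 250000,      // limit for any single emitted asset
  },
};
```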
Top GitHub Comments
FINALLY had a look at this—super excited about it! Echoing @housseindjirdeh, I love the idea of `budgets.js/json` becoming something that could be used across other tools.

I like this too. Will likely lead to more people using performance budgets (and `budgets.json`). It will also help provide some general awareness and a starting point for folks. I may be overthinking it, but couldn't `threshold` accomplish the same thing as `warn`/`error`? I'm picturing folks setting a budget and then LH warns if it's between the budget and the threshold, and errors if it exceeds the threshold.

This is exciting! 🎉🎉🎉 Some initial thoughts (feel free to ignore):
- … `budgets.js/json` adoption with other tools.
- … `budgets.json` file to set performance budgets! Click here to see more information.