Performance idea: `--sf`/`--slow-first` option to improve resource utilization

Reading this blog post about Stripe’s test runner made me think we should have a --slow-first option for xdist, and it seems that we don’t yet 😅 The motivation for --slow-first is that running the fastest tests last is a great heuristic for shortening the tail of a test run, when some processes are done but others are still running. That tail can range from negligible to “several times longer than the rest of the run” (e.g. when I select mostly fast unit tests, plus a few slow integration tests which happen to run last).

IMO this should be lower priority than --last-failed, should only reorder passing tests for --failed-first, and should be incompatible with --new-first (see the existing flag docs). The main trick is to cache durations from the last run, and then order by the aggregate time for each loadscope (i.e. method, class, or file, depending on what we’ll distribute; pytest-randomly is useful prior art).
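A minimal sketch of the ordering step, assuming the durations from the previous run are already available as a dict keyed by pytest node ID. The names `scope_key` and `slow_first`, the `default` used for tests without a cached duration, and the splitting of node IDs on `::` are illustrative assumptions, not part of pytest or xdist:

```python
from collections import defaultdict


def scope_key(nodeid: str, scope: str) -> str:
    """Map a node ID such as 'tests/a.py::TestX::test_y' to its group.

    Assumes '::' only separates file, class and function names.
    """
    parts = nodeid.split("::")
    if scope == "file":
        return parts[0]
    if scope == "class" and len(parts) > 2:
        return "::".join(parts[:2])
    return nodeid  # "test" scope, or a function that is not inside a class


def slow_first(nodeids, durations, scope="file", default=0.0):
    """Order groups by total cached duration, slowest group first.

    Tests inside a group keep their collection order; tests without a
    cached duration count as `default` seconds.
    """
    totals = defaultdict(float)
    members = defaultdict(list)
    for nodeid in nodeids:
        key = scope_key(nodeid, scope)
        totals[key] += durations.get(nodeid, default)
        members[key].append(nodeid)
    # sorted() is stable, so groups with equal totals keep collection order.
    ordered_groups = sorted(members, key=lambda key: -totals[key])
    return [nodeid for key in ordered_groups for nodeid in members[key]]
```

Choosing `default` is a policy decision: a large value would run never-seen tests early, which is closer in spirit to --new-first, while `0.0` leaves them at the end.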

Issue Analytics

  • State: open
  • Created: 2 years ago
  • Reactions: 7
  • Comments: 7 (5 by maintainers)

Top GitHub Comments

6 reactions
Zac-HD commented, May 10, 2021

> One detail however is that this doesn’t need to be implemented in xdist at all, any plugin which reorders tests using pytest_collection_modifyitems would work, as xdist would then see the reordered list, and schedule tests accordingly.

It could be implemented elsewhere, but the --slow-first ordering only improves performance if you’re running tests in parallel, so I think xdist is the most sensible place for it. It could even make single-core performance worse, e.g. in combination with -x/--exitfirst. For best results --slow-first also needs to know the current value of xdist’s --dist argument.
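For comparison, the plugin-based variant discussed here might look roughly like the following `conftest.py` sketch. The `--slow-first` flag and the cache key are hypothetical names; the hooks, `config.cache`, and the `workerinput` check for xdist workers are existing pytest and xdist facilities. The sketch assumes the cache plugin is enabled, and it orders individual tests only, so it ignores the --dist mode mentioned above:

```python
# conftest.py (sketch): "--slow-first" and CACHE_KEY are hypothetical names.
CACHE_KEY = "slow_first/durations"
_seen = {}  # nodeid -> duration of the "call" phase in this run


def pytest_addoption(parser):
    parser.addoption(
        "--slow-first", action="store_true", dest="slow_first",
        help="run the slowest tests from the previous run first",
    )


def pytest_collection_modifyitems(config, items):
    if not config.getoption("slow_first"):
        return
    cached = config.cache.get(CACHE_KEY, {})
    # Stable in-place sort: slowest first, tests without a cached duration last.
    items.sort(key=lambda item: -cached.get(item.nodeid, 0.0))


def pytest_runtest_logreport(report):
    # Record only the test body, not setup or teardown time.
    if report.when == "call":
        _seen[report.nodeid] = report.duration


def pytest_sessionfinish(session):
    config = session.config
    if hasattr(config, "workerinput"):  # xdist worker: the controller writes
        return
    merged = {**config.cache.get(CACHE_KEY, {}), **_seen}
    config.cache.set(CACHE_KEY, merged)
```

Durations are merged with the previous cache so that a partial run (for example with `-k`) does not discard timings for deselected tests.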

For example, take a test suite with five 1s tests in file A, a single 3s test in file B, and two 3s tests in file C; and assume that we have two cores.

  • With --dist=load
    • Currently, we’ll have core1 run A1 A3 A5 C1 = 6s and core2 run A2 A4 B C2 = 8s
    • --slow-first would have core1 run B C2 A4 = 7s and core2 run C1 A1 A2 A3 A5 = 7s (speedup!)
  • With --dist=loadfile
    • Currently, we’ll have core1 run A = 5s and core2 run B C = 9s
    • --slow-first would have core1 run C = 6s and core2 run A B = 8s (speedup!)
  • --dist=each is of course equivalent to single-core, so no benefit from --slow-first

So on this toy model we get an 11-13% wall-clock speedup (8s down to 7s with --dist=load, 9s down to 8s with --dist=loadfile) just from better task ordering!
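The per-core totals in the toy model can be checked with a small simulation. This is a sketch under a simplifying assumption: whichever worker becomes idle first takes the next queued task, with ties going to the lower-numbered worker. It is not xdist's actual scheduler, and `makespan` and `slowest_first` are illustrative names:

```python
import heapq


def makespan(durations, workers=2):
    """Wall-clock time when each idle worker takes the next queued task."""
    free_at = [(0, w) for w in range(workers)]  # (time worker is idle, id)
    heapq.heapify(free_at)
    for seconds in durations:
        idle_time, worker = heapq.heappop(free_at)
        heapq.heappush(free_at, (idle_time + seconds, worker))
    return max(t for t, _ in free_at)


def slowest_first(durations):
    return sorted(durations, reverse=True)


tests = [1, 1, 1, 1, 1, 3, 3, 3]  # A1..A5, B, C1, C2 in collection order
files = [5, 3, 6]                 # per-file totals for A, B, C

print(makespan(tests), makespan(slowest_first(tests)))  # 8 7  (--dist=load)
print(makespan(files), makespan(slowest_first(files)))  # 9 8  (--dist=loadfile)
```

Sorting by descending duration before greedy assignment is the classic longest-processing-time-first heuristic for balancing work across identical machines.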

In the real world, I have twelve cores and Hypothesis’ 2500 cover tests take ~70s with the slowest ten tests taking 5-15s each; the 500 nocover tests take ~35s with the slowest ten taking 5-19s each (and yes we’ve taken the low-hanging perf fruit). Anecdotally, it’s pretty obvious towards the end that things are slowing down and a few cores are idling, and I’d expect a similar 10%-20% wall-clock improvement.

0 reactions
nicoddemus commented, Dec 12, 2022

Awesome @klimkin, thanks for sharing!
