question-mark
Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

"Upload missing inputs" performance regression in Bazel 5.0 and 5.3

See original GitHub issue

Description of the bug:

cc @coeuvre – this bug is very similar to #15872, but the regression actually occurred in the 5.0.0 release. I did a git bisect between 41feb616ae and 2ac6581, and it appears https://github.com/bazelbuild/bazel/commit/db15e47d0391d904c29e6e5c632089e2479e62c2 is the source of this slower behavior.

In the actual repo affected by this, we have ~600k inputs to some actions, which are taking nearly 7 seconds on this step with a recent release-5.3.0 commit. (We thought the fix to #15872 may have helped here, but it is consistent with 5.0’s performance, and still slower than 4.2.)

image

Unfortunately this overhead was much smaller (<1s) with Bazel 4.2.2, and this regression, while not the same magnitude as that of #15872, is still significant for us.

What’s the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Using the same test repo, https://github.com/clint-stripe/action_many_inputs/, but with 300,000 inputs to the action (edit the number in WORKSPACE to change this):

I have a script that runs the same set of commands, which reproduces this very clearly: (In the action_many_inputs repo, with bazel checked out and built in a sibling directory)

$ ../bazel/bazel-bin/src/bazel-dev test ... --config=remote --remote_download_toplevel --profile=/tmp/bazel-bisect.profile

Run this once, to ensure that the inputs have all been uploaded – we want to measure the time when no action inputs change.

Then, run this a few times (it’s ok if the test doesn’t actually run; I usually hit ctrl-c after ~10 seconds just to ensure the “upload missing inputs” step is complete), and check the timing in the profile.

If it’s helpful, you can look at just the event we care about:

cat "$PROFILE_PATH" | jq '.traceEvents[] | select(.name == "upload missing inputs") | .dur = (.dur/1e3)'
commit run # duration (ms)
db15e47d0391d904c29e6e5c632089e2479e62c2 1 2665
db15e47d0391d904c29e6e5c632089e2479e62c2 2 2513
db15e47d0391d904c29e6e5c632089e2479e62c2 3 2425
db15e47d0391d904c29e6e5c632089e2479e62c2 4 2484
3ada002c4fc690630eb6ce82b2c06bd0cc0bdda2 1 967
3ada002c4fc690630eb6ce82b2c06bd0cc0bdda2 2 679
3ada002c4fc690630eb6ce82b2c06bd0cc0bdda2 3 660
3ada002c4fc690630eb6ce82b2c06bd0cc0bdda2 4 560

Which operating system are you running Bazel on?

linux

What is the output of bazel info release?

No response

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

~/.bazel_binaries/bazel-5.0.0/bin/bazel build src:bazel-dev

What’s the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

I still have the full profiles from most of these bazel invocations, happy to share if there’s anything else there. (Unfortunately these changes predate the more granular profiling that distinguishes between ‘collect digests’ and ‘find missing digests’.)

Issue Analytics

  • State:open
  • Created a year ago
  • Comments:5 (3 by maintainers)

github_iconTop GitHub Comments

1reaction
clint-stripecommented, Aug 5, 2022

@brentleyjones it’s not – that regression took these tests from ~2s to ~45s, but it’s still several times higher than in 4.2. (I ran a few of the same tests on release-5.3.0 branch at 9d57003f8d5735eda9bd6207c51f0e9faa6c797a, with about the same results.)

0reactions
morotencommented, Aug 17, 2022
  • It might be more expense to build the digest map even when the cache is disabled.

The execution time increase I would expect is due to the recursive visitor pattern when building the Merkle tree. Now, that pattern is already part of the old MerkleTree build, so for this case I assume it the Merkle tree caching should not add any processing time. Looking at a real measurement would of course of interesting.

Read more comments on GitHub >

github_iconTop Results From Across the Web

Bazel 5.3
It is fully backward compatible with Bazel 5.0 and contains ... Remote: Fix performance regression in "upload missing inputs" (15890) ...
Read more >
Bazel 5.3.0 release candidate 2 is available for testing
Bazel 5.3.0rc2 is now available for those that want to try it out. ... Remote: Fix performance regression in "upload missing inputs".
Read more >
Not possible to select() on constraint_value in alias rules
When running bazel build :foo , the following error is produced: ... Remote: Fix performance regression in "upload missing inputs".
Read more >
How Bazel 5.0 Makes Your Builds Faster - BuildBuddy
On the output side of things, if multiple actions produced the same output, then Bazel would upload the output multiple times. If these...
Read more >
CHANGELOG.md - bazel - Git at Google
See https://github.com/bazelbuild/bazel/issues/16654 for details. ... Remote: Fix performance regression in "upload missing inputs". (#15998).
Read more >

github_iconTop Related Medium Post

No results found

github_iconTop Related StackOverflow Question

No results found

github_iconTroubleshoot Live Code

Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start Free

github_iconTop Related Reddit Thread

No results found

github_iconTop Related Hackernoon Post

No results found

github_iconTop Related Tweet

No results found

github_iconTop Related Dev.to Post

No results found

github_iconTop Related Hashnode Post

No results found