Remote Execution: Symlinks created by ctx.actions.symlink are not represented as symlinks remotely


Description of the bug:

Using the remote execution capabilities of Bazel the behavior differs between a local and a remote execution.

When using ctx.actions.symlink(), one would expect the declared file to be a symlink to some source. With local execution this is the case. With remote execution, the declared file is instead represented directly by the symlinked file.

Thus, the content of the file is correct, but the semantics of the symlink are gone. Build actions might rely on the semantics of a symlink and query whether a file is a symlink. This leads to different results between local and remote execution, which in the worst case can cause a thread poisoning.
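To make the reported difference concrete, here is a minimal Python sketch that does not use Bazel at all. It assumes that a symlink stands in for the locally executed output and a plain copy stands in for the remotely executed one, as the description states. The file names and the `describe` helper are hypothetical.

```python
import os
import shutil
import tempfile


def describe(path):
    """Return how the path is materialized and what it contains."""
    with open(path) as f:
        content = f.read()
    kind = "symlink" if os.path.islink(path) else "regular file"
    return kind, content


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "source.txt")
    with open(source, "w") as f:
        f.write("hello\n")

    # Stand-in for the local case: the output is a symlink to the source.
    local_out = os.path.join(tmp, "local_out.txt")
    os.symlink(source, local_out)

    # Stand-in for the remote case: the output is a regular file
    # with the same content as the source.
    remote_out = os.path.join(tmp, "remote_out.txt")
    shutil.copyfile(source, remote_out)

    local_result = describe(local_out)
    remote_result = describe(remote_out)

print(local_result)
print(remote_result)
```

Both outputs have identical content, so a tool that only reads the file behaves the same in both cases. A tool that branches on `os.path.islink` does not.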

What’s the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

A minimal reproducible example is available:

https://github.com/castler/buildbarn_bazel_symlink_issue_repro#how-to-reproduce-the-issue

Which operating system are you running Bazel on?

Linux

What is the output of bazel info release?

development version

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

Bazelisk using last_green.

Using commit: 6efc2ab2302f31dd522bcf955bf23cec4f1a95b5

What’s the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

I first thought it was a Remote Execution Service issue, which was analyzed here: https://github.com/buildbarn/bb-remote-execution/issues/104

https://github.com/bazelbuild/bazel/issues/11119 discusses symlinks not being cached remotely. Maybe that has a similar root cause.

https://github.com/bazelbuild/bazel/issues/6547

https://github.com/bazelbuild/bazel/commit/666fce514b87e6901b033f3399e2d4b56a856429 seemed related to me, but sadly did not fix the underlying problem.

Any other information, logs, or outputs that you want to share?

No response

Issue Analytics

  • State: open
  • Created a year ago
  • Comments: 8 (7 by maintainers)

Top GitHub Comments

2 reactions
tjgq commented, Aug 9, 2022

@tjgq : My understanding is that I use the experimental feature ctx.actions.declare_symlink if I want to allow dangling symlinks that are created by an action. In my case, I want neither that an action creates the symlink nor that it can dangle. Thus, given the current documentation, I would in any case choose the combination of ctx.actions.declare_file with ctx.actions.symlink (even if the experimental feature worked with remote execution and were stable).

Is my understanding correct?

ctx.actions.symlink with a ctx.actions.declare_symlink output can create either a dangling or a non-dangling symlink; whether it dangles depends on whether the target of the symlink exists. The only thing Bazel guarantees is that calling readlink on the symlink will return the target supplied to ctx.actions.symlink (irrespective of whether it was supplied as a File, via the target_file argument, or as a string, via the target_path argument). If an action were to receive as an input the symlink but not the file it points to, it would not be able to read the target file (under sandboxed conditions). This is why Bazel calls it an “unresolved” rather than a “dangling” symlink: Bazel won’t attempt to resolve it, but that doesn’t necessarily mean it won’t resolve successfully.
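The “unresolved” notion can be illustrated at the filesystem level with a small Python sketch. This is only an analogy and not Bazel code. The link name and target path are hypothetical. It shows that readlink returns the stored target whether or not the target exists.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    link = os.path.join(tmp, "out_link")
    target = "does/not/exist"  # relative target, resolved against tmp

    # Creating a symlink does not require the target to exist.
    os.symlink(target, link)

    stored_target = os.readlink(link)        # the string given at creation
    is_link = os.path.islink(link)           # inspects the link itself
    resolves_before = os.path.exists(link)   # follows the link

    # Once the target appears, the same symlink resolves without changes.
    os.makedirs(os.path.join(tmp, "does", "not"))
    with open(os.path.join(tmp, "does", "not", "exist"), "w") as f:
        f.write("now present\n")
    resolves_after = os.path.exists(link)

print(stored_target, is_link, resolves_before, resolves_after)
```

Whether the link dangles is a property of the surrounding filesystem at the time it is followed, not of the link itself.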

By contrast, when you call ctx.actions.symlink with a ctx.actions.declare_file output, Bazel guarantees that an action can observe the contents of the file at the other end of the symlink, even when the declare_file is the only input. The way I see it, the symlink is just an implementation detail; making a copy of the original file would be equivalent, albeit less efficient.

If so, I would like to understand this a little better:

For the second form of ctx.actions.symlink, you are correct that local execution materializes the ctx.actions.symlink output as a symlink in bazel-bin, while remote execution materializes it as a regular file.

Do we see this behavior as final? That’s how I interpret your answer.

I don’t have a strong opinion. One could argue that making the contents of bazel-bin dependent on the execution strategy is a bug. But Bazel already uses symlinks to stand in for real files in other places (again, consider sandboxed execution) so one could also say that the distinction between a symlink and the file it points to is not to be relied upon. I lean towards the latter position.

An important point I want to stress again, though, is that the contents of bazel-bin do not necessarily reflect Bazel’s internal state; a declare_file will be tracked internally as a regular file, irrespective of whether it materializes as a symlink.

So basically I read that symlink with declare_file() is not guaranteed to create a symlink. If that is our understanding, I would propose to document that in the API and then close this issue.

Yes, I agree that the documentation is unclear; I was confused myself until it was pointed out to me that ctx.actions.symlink does two different things. I will send a PR to improve it.

For future use cases that need guaranteed symlinks, people should then use declare_symlink with symlink.

I agree, although sadly you don’t currently have that option if you wish to use remote execution (due to #10298).

Finally, I should note that it’s a bad idea for actions to be sensitive to whether an input is a file or a symlink to said file, since sandboxed execution (which works by executing the action inside a symlink tree) would affect their outcome.

I agree with you. However, I understood from the current documentation that the creation of a symlink is guaranteed.
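The point about sandboxed execution can be sketched in Python under a strong simplification: the sandbox is modeled as a directory of symlinks pointing at the real inputs. This is not how Bazel implements sandboxing in detail. `build_symlink_tree` and `read_input` are hypothetical helpers.

```python
import os
import tempfile


def build_symlink_tree(inputs, sandbox_dir):
    """Stage each input in sandbox_dir as a symlink to the real file."""
    staged = []
    for path in inputs:
        link = os.path.join(sandbox_dir, os.path.basename(path))
        os.symlink(path, link)
        staged.append(link)
    return staged


def read_input(path):
    """Read content only; open() follows symlinks transparently."""
    with open(path) as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    exec_root = os.path.join(tmp, "execroot")
    sandbox = os.path.join(tmp, "sandbox")
    os.mkdir(exec_root)
    os.mkdir(sandbox)

    real = os.path.join(exec_root, "input.txt")
    with open(real, "w") as f:
        f.write("data\n")

    (staged,) = build_symlink_tree([real], sandbox)

    # The same input looks different depending on how it was staged.
    unsandboxed_is_link = os.path.islink(real)
    sandboxed_is_link = os.path.islink(staged)
    # Reading the content gives the same answer either way.
    same_content = read_input(real) == read_input(staged)

print(unsandboxed_is_link, sandboxed_is_link, same_content)
```

An action that only reads its inputs gets the same result under both staging strategies, whereas one that branches on the file type does not.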

@fmeum : Yes, thank you for the hint to #10298

0 reactions
tjgq commented, Oct 20, 2022

I’ve just submitted 32b0f5a, which will cause non-symlink outputs created via ctx.actions.symlink (i.e., with a target_file parameter) to be materialized on the local filesystem as symlinks when --remote_download_minimal is enabled. This should avoid duplicate downloads of the same object when multiple symlinks target it. (Except possibly on Windows, of course - I haven’t had a chance to look into it yet.)
