Pipes appear to not pass on EOF on macOS runners

Describe the bug

(first reported at https://github.com/actions/virtual-environments/discussions/2352)

Using yes | some_script anywhere in a workflow on macOS hangs forever. some_script will terminate but yes will keep going, blocking the rest of the workflow.

To Reproduce

I’ve made a minimal test case here: https://github.com/kousu/hanging-actions/. It basically just tests:

yes | head -n 7

plus collecting some context for debugging.

To reproduce, just fork my repo and watch its Actions tab. Linux (and Travis!) will pass easily, but Actions on macOS hangs until cancelled.

Expected behavior

All platforms should succeed in approximately the same time and way.

Runner Version and Platform

From my logs:

Request a runner to run this job

  • Current runner version: ‘2.275.1’

Operating System

  • Mac OS X
  • 10.15.7
  • 19H15

Virtual Environment

Thanks to @maxim-lobanov we know that it only happens under actions/runner, and not under https://github.com/microsoft/azure-pipelines-agent (even though azure pipelines runs the same macOS images) or when connected over VNC or ssh.

What’s not working?

The symptom is that the Linux builds finish immediately while the macOS build hangs in yellow forever:

[Screenshot: 2020-12-25-043804_590x265_scrot]

In fact, even Travis, on nearly (but not exactly) the same version of macOS, works fine.

For example, compare this Actions workflow with this Travis script: everything finishes in about a minute except for Actions on macOS.

This should only take a moment to finish, and on Travis it does:

[Screenshot: 2020-12-25-044911_991x287_scrot]

but on Actions it’s at 3 minutes and counting. I’ve had jobs hung much longer too – up to their 6 hour limit – before I noticed what was going on:

[Screenshot: 2020-12-25-043814_969x287_scrot]

The only way to stop the job is to cancel it. It never undeadlocks.

Job Log Output

Runner and Worker’s Diagnostic Logs

I don’t have access to these! If I install a runner locally and reproduce there I’ll update this.

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 5
  • Comments: 12 (2 by maintainers)

Top GitHub Comments

1 reaction
tavianator commented, Jan 21, 2022

genv --default-signal=PIPE yes | head -n7 is the only workaround I could figure out. Needs GNU coreutils installed from e.g. Homebrew.
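
For anyone who wants to reuse the workaround above across several steps, here is a minimal sketch of a wrapper. The helper name with_default_sigpipe is made up for illustration. It assumes GNU env with --default-signal support is reachable either as genv (Homebrew coreutils) or as env; the Perl fallback is my own assumption and not something suggested in the thread.

```shell
#!/bin/sh
# with_default_sigpipe (hypothetical helper): run a command with SIGPIPE
# reset to its default action, even if the calling process ignores it.
with_default_sigpipe() {
    # Prefer GNU env, either as Homebrew's genv or as the system env.
    # The probe with `true` checks that --default-signal is supported.
    for candidate in genv env; do
        if "$candidate" --default-signal=PIPE true >/dev/null 2>&1; then
            "$candidate" --default-signal=PIPE "$@"
            return
        fi
    done
    # Assumption, not from the thread: fall back to Perl, which can restore
    # the default handler and then exec the command in the same process.
    perl -e '$SIG{PIPE} = "DEFAULT"; exec { $ARGV[0] } @ARGV or die "exec: $!\n"' -- "$@"
}

# Same pipeline as the test case, with the writer wrapped.
with_default_sigpipe yes | head -n 7
```

Only the writing side of the pipe needs the wrapper, since that is the process that has to die from SIGPIPE.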

1 reaction
tavianator commented, Jan 21, 2022

I am running into this here: https://github.com/tavianator/homebrew-tap/runs/4874104560?check_suite_focus=true. I think I found the issue: macOS uses node to spawn the shell to work around an SIP issue: https://github.com/actions/runner/blob/c95d5eae3092358e0027f4d26b9962eedf2af93c/src/Runner.Worker/Handlers/ScriptHandler.cs#L270-L279

macos-run-invoker.js calls the shell with spawn(): https://github.com/actions/runner/blob/c95d5eae3092358e0027f4d26b9962eedf2af93c/src/Misc/layoutbin/macos-run-invoker.js#L1-L13

Old versions of node have a bug in spawn() that runs the child with SIGPIPE ignored: https://github.com/nodejs/node/issues/13662. Apparently macOS yes doesn’t check for EPIPE, it just expects to die from SIGPIPE, so it hangs forever writing y to a broken pipe.

I suspect using Node 16 instead of Node 12 would fix it, but I can’t confirm easily because I don’t have hands-on access to macOS.
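
The inheritance part of this explanation can be checked without the runner. The sketch below is only an illustration under my own assumptions: it uses trap '' PIPE in a shell to stand in for the node behaviour, and a bounded shell loop instead of yes so that it cannot hang. The helper name writer_status is hypothetical. With the default disposition the writer is normally killed by SIGPIPE (typically reported as status 141, i.e. 128 + 13); with SIGPIPE ignored the writer inherits that setting, sees a failed write instead, and exits on its own.

```shell
#!/bin/sh
# A bounded writer: stops on the first failed write, or after 100000 lines,
# so it terminates even if a shell's echo does not report write errors.
writer='i=0; while [ "$i" -lt 100000 ] && echo y; do i=$((i + 1)); done 2>/dev/null'

# writer_status (hypothetical helper): pipe the writer into `head -n 1` and
# print the exit status of the writer, not of the pipeline as a whole.
# File descriptor 3 carries the status around the pipe.
writer_status() {
    { { sh -c "$writer"; echo "$?" >&3; } | head -n 1 >/dev/null; } 3>&1
}

# Default disposition: the writer is expected to be killed by SIGPIPE.
echo "default SIGPIPE: writer exit status $(writer_status)"

# Ignored disposition: `trap "" PIPE` in the parent is inherited by the
# child, so the write fails with an error instead of killing the writer.
echo "ignored SIGPIPE: writer exit status $(trap '' PIPE; writer_status)"
```

A writer that never checks the result of its writes, as macOS yes is described as doing above, would loop forever in the second case, which matches the hang seen on the runner.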
