How to handle upgrade with ephemeral
To support ephemeral runners as Docker containers, we created an init script which runs the following:
/config.sh <arguments>
/run.sh
We’ve noticed that if the Actions runner version is old, the runner will self-update and then exit without actually running a job or de-registering. When no upgrade is needed, the runner works correctly.
We are using the latest Ubuntu Docker image as a base.
Here is the log from the container:
# Authentication
√ Connected to GitHub
# Runner Registration
√ Runner successfully added
√ Runner connection is good
# Runner settings
√ Settings Saved.
√ Connected to GitHub
2021-10-01 20:50:54Z: Listening for Jobs
Runner update in progress, do not shutdown runner.
Downloading 2.283.2 runner
Waiting for current job finish running.
Generate and execute update script.
Runner will exit shortly for update, should be back online within 10 seconds.
√ Removed .credentials
√ Removed .runner
Is there a way to either skip the upgrade or have the runner process a job?
Issue Analytics
- State:
- Created 2 years ago
- Reactions:4
- Comments:18 (6 by maintainers)
Top GitHub Comments
We’re adding an option to allow self-hosted ephemeral runners to opt-out of automatic updates so that you can manage updates yourself.
Some background: we consider the runner software and the hosted Actions software as a cohesive whole. Many times when we add a new feature to GitHub Actions, these changes need to be made both on the hosted service and in the runner - for example, when we added conditional steps to composite actions. This is why we’ve always required runner updates, so that we can be sure that the runner is compatible with the service version.
Obviously this is a painful requirement for many ephemeral users. So we’ll add an opt-out mechanism for ephemeral, where the runner will not try to do a self-update. This flag will allow you to control when you update your runners.
Because the runner versions are so tightly coupled to the overall service, you’ll be required to update within a month of a new runner version being released. After a month, your runners will no longer be able to connect to GitHub, so you will need to perform updates regularly. As soon as a new version is released, the runner will begin reporting on stdout and stderr that an update is available. We’ll also start adding annotations to workflow runs on outdated runners.
This is in development now and we plan to have it generally available in the new year.
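For readers arriving later, here is a minimal sketch of what an init script using such an opt-out might look like. It assumes the opt-out is exposed as the `--disableupdate` option of `config.sh`; that flag name does not appear in this thread, so confirm it with `./config.sh --help` on your runner version. `RUNNER_URL`, `RUNNER_TOKEN`, and `RUNNER_HOME` are hypothetical environment variable names chosen for illustration.

```shell
#!/bin/sh
# Sketch of an init script for an ephemeral, containerized runner.
# Assumptions: --disableupdate is the opt-out flag (verify with ./config.sh --help);
# RUNNER_URL, RUNNER_TOKEN and RUNNER_HOME are hypothetical variable names.

start_runner() {
    # Build the registration arguments. Positional parameters set here are
    # local to this function call.
    set -- --unattended --ephemeral --disableupdate \
        --url "$RUNNER_URL" --token "$RUNNER_TOKEN"

    # RUNNER_HOME defaults to empty, which yields /config.sh and /run.sh
    # as in the original question.
    "${RUNNER_HOME:-}/config.sh" "$@" || return 1

    # Listen for a single job; an ephemeral runner de-registers afterwards.
    "${RUNNER_HOME:-}/run.sh"
}
```

End the script with a call to `start_runner`. With updates disabled, keeping the image current becomes your job, for example by rebuilding it whenever a new runner version is released.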
@TingluoHuang #1384 seems relevant, but I think the crux of this problem is that the auto update procedure creates a new process before the old one waits around for a few seconds and exits. The old process does not appear to check for success on upgrade, it just waits a fixed time and exits. In a containerized runner, when this happens all processes in the container are killed, whether or not the upgrade actually had time to complete. I see this routinely.
This changing pid throughout the upgrade procedure doesn’t play well with containerized runners that don’t also embed their own service manager (systemd). This is why I question whether ephemeral runners should be subject to auto upgrade at all.
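The handoff problem described in this comment can be reproduced with a toy model. The sketch below is not the runner's code: `slow_update` is a hypothetical stand-in for the generated update script, and the delays are arbitrary. It contrasts a parent that waits a fixed time and moves on with one that waits for the child to finish.

```shell
#!/bin/sh
# Toy model of the update handoff; none of this is the runner's actual code.

# Hypothetical stand-in for the update script: works for $1 seconds,
# then writes a marker file at $2 to signal completion.
slow_update() {
    sleep "$1"
    echo done > "$2"
}

# Fixed-delay handoff: start the update, sleep a fixed time, then give up
# waiting regardless of whether the update finished.
handoff_fixed_delay() {
    marker=$1
    slow_update 2 "$marker" >/dev/null 2>&1 &
    sleep 1
    if [ -f "$marker" ]; then echo finished; else echo unfinished; fi
}

# Waiting handoff: block until the update process itself has exited.
handoff_wait() {
    marker=$1
    slow_update 2 "$marker" >/dev/null 2>&1 &
    wait $!
    if [ -f "$marker" ]; then echo finished; else echo unfinished; fi
}
```

If the process running the fixed-delay variant is PID 1 of a container, its exit stops the container and the unfinished update is killed along with it. The waiting variant avoids that only as long as the original process stays alive for the whole update.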
People are expecting the --ephemeral option to finally bring with it basic support for orchestrating containerized runners and I’m not sure it does just yet. The design of the upgrade process seems to be a blocker.
Personally, I’d be fine with a --autoupdate=false option being available either with or without --ephemeral.