Queueing of self-hosted runners is busted
UPDATE: Updated title and description of issue based on further findings.
Summary:
“If a job runs but there are no runners available (either they are all busy with other jobs or all offline) then it no longer waits but rather aborts immediately. This doesn’t appear to be anything to do with the runners themselves but rather be whatever is responsible on GitHub’s servers to dispatch job requests to available runners.” - @Bo98
Details:
We have ephemeral self-hosted runners running in AWS. When a new workflow is triggered on a repo, GitHub webhooks call an API that starts ephemeral runners in AWS ECS Fargate. We use a custom label to choose which ephemeral runner to use. To make this work, we had to register a runner with that label and then leave the “offline” runner entry in the Actions runner list for the repos. If we didn’t, we would get an error saying that no runner with the label existed. This worked fine until a few days ago. Now we need a runner with that label that is online or idle. Otherwise we get the following error:
An image label with the label menuintegrationdev does not exist.
The remote provider was unable to process the request.
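For readers unfamiliar with this kind of setup, the webhook side described above could look roughly like the sketch below. It is not our actual API: it assumes the webhook is subscribed to GitHub's `workflow_job` event and that deliveries are signed with a shared secret (the `X-Hub-Signature-256` header). The function names `signature_is_valid` and `should_launch_runner` are made up for illustration, and the step that actually starts the Fargate task is left out.

```python
import hashlib
import hmac
import json

# Custom label taken from the error message above.
RUNNER_LABEL = "menuintegrationdev"


def signature_is_valid(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check the X-Hub-Signature-256 header (HMAC-SHA256 of the raw request body)."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how much of the signature matched.
    return hmac.compare_digest(expected, signature_header)


def should_launch_runner(event_name: str, body: bytes) -> bool:
    """Return True when a newly queued job requests our custom label.

    Hypothetical helper: the caller would start an ephemeral runner
    (for example an ECS Fargate task) only when this returns True.
    """
    if event_name != "workflow_job":
        return False
    payload = json.loads(body)
    job = payload.get("workflow_job", {})
    return payload.get("action") == "queued" and RUNNER_LABEL in job.get("labels", [])
```

With a handler like this, a runner is only started after GitHub has queued the job, which is why the job needs to wait for a runner rather than abort when none is online yet.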
Issue Analytics
- Created 3 years ago
- Comments: 6 (2 by maintainers)
Top GitHub Comments
Fix was rolled out.
We’ve identified the bug and are currently rolling out the fix. Will update again once the fix is rolled out.