Queueing of self-hosted runners is busted

UPDATE: Updated title and description of issue based on further findings.

Summary:

“If a job runs but there are no runners available (either they are all busy with other jobs or all offline) then it no longer waits but rather aborts immediately. This doesn’t appear to have anything to do with the runners themselves but rather with whatever is responsible on GitHub’s servers for dispatching job requests to available runners.” - @Bo98

Details:

We run ephemeral self-hosted runners in AWS. When a new workflow is triggered on a repo, GitHub webhooks call an API that starts an ephemeral runner in AWS ECS Fargate. We use a custom label to choose which ephemeral runner to use. To make this work, we had to register a runner with that label and then leave the “offline” runner entry in the Actions runner list for the repos. Without that entry, we would get an error saying that no runner with the label existed. This worked fine until a few days ago. Now a runner with that label has to be online or idle, otherwise we get the following error:

An image label with the label menuintegrationdev does not exist.

The remote provider was unable to process the request.
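
For readers unfamiliar with this kind of setup, here is a minimal sketch of the webhook-to-Fargate step described above. It is not our actual code. It assumes the webhook delivers GitHub's `workflow_job` event and that the API starts the runner through the ECS `run_task` call. The cluster, task definition, container, subnet, and environment variable names are all hypothetical placeholders.

```python
from typing import Optional

# Custom runner label, taken from the error message above.
RUNNER_LABEL = "menuintegrationdev"


def build_run_task_request(event: dict) -> Optional[dict]:
    """Return keyword arguments for ECS run_task, or None if no runner is needed.

    Assumption: `event` is the JSON body of a GitHub `workflow_job` webhook.
    """
    # Only a newly queued job needs a fresh ephemeral runner.
    if event.get("action") != "queued":
        return None

    # Only react to jobs that ask for our custom label.
    job = event.get("workflow_job", {})
    if RUNNER_LABEL not in job.get("labels", []):
        return None

    return {
        "cluster": "runners",                  # hypothetical cluster name
        "taskDefinition": "ephemeral-runner",  # hypothetical task definition
        "launchType": "FARGATE",
        "count": 1,
        "networkConfiguration": {
            # hypothetical subnet ID, replace with a real one
            "awsvpcConfiguration": {"subnets": ["subnet-PLACEHOLDER"]}
        },
        "overrides": {
            "containerOverrides": [
                {
                    "name": "runner",  # hypothetical container name
                    # hypothetical variable the runner image reads its labels from
                    "environment": [
                        {"name": "RUNNER_LABELS", "value": RUNNER_LABEL}
                    ],
                }
            ]
        },
    }
```

The returned dictionary could then be passed to `boto3.client("ecs").run_task(**request)`. Because the runner only comes online after the job is queued, this flow depends on GitHub holding the job in the queue while no runner with the label is online, which is the behavior that stopped working.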