Self-hosted runners terminated prematurely

To add insult to injury, the machines aren't being terminated after the failure:

$ journalctl --unit cml
-- Logs begin at Sat 2022-10-08 21:59:41 UTC, end at Thu 2022-11-17 14:42:27 UTC. --
Nov 17 10:53:35 ip-172-31-81-177 systemd[1]: Started cml.service.
Nov 17 10:53:35 ip-172-31-81-177 cml.sh[2214]:   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Nov 17 10:53:35 ip-172-31-81-177 cml.sh[2214]:                                  Dload  Upload   Total   Spent    Left  Speed
Nov 17 10:53:35 ip-172-31-81-177 cml.sh[2214]: [158B blob data]
Nov 17 10:53:36 ip-172-31-81-177 cml.sh[2214]: {"level":"warn","message":"Github Actions timeout has been updated from 72h to 35 days. Update your workflow accordingly to be able to restart it automatically."}
Nov 17 10:53:36 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Preparing workdir /tmp/tmp.FFQVuNxrbe/.cml/cml-679w0e9r8o..."}
Nov 17 10:53:36 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Launching github runner"}
Nov 17 10:53:41 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Terraform 1.3.2"}
Nov 17 10:53:41 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Plan: 0 to add, 0 to change, 0 to destroy."}
Nov 17 10:53:41 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Apply complete! Resources: 0 added, 0 changed, 0 destroyed."}
Nov 17 10:53:41 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Outputs: 0"}
Nov 17 10:53:41 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Connected to acpid service."}
Nov 17 10:53:51 ip-172-31-81-177 cml.sh[2214]: {"date":"2022-11-17T10:53:51.690Z","level":"info","message":"runner status","repo":"https://github.com/iterative/cml-textual-inversion","status":"ready"}
Nov 17 10:58:23 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Retrying after 153 seconds!"}
Nov 17 11:00:56 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Retrying after 0 seconds!"}
Nov 17 11:00:56 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Retrying after 0 seconds!"}
Nov 17 11:00:56 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Retrying after 0 seconds!"}
Nov 17 11:00:56 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Retrying after 0 seconds!"}
Nov 17 11:04:40 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Retrying after 3377 seconds!"}
Nov 17 12:00:57 ip-172-31-81-177 cml.sh[2214]: {"date":"Thu Nov 17 2022 12:00:57 GMT+0000 (Coordinated Universal Time)","error":{"name":"HttpError","request":{"headers":{"accept":"application/vnd.github.v3+json","authorization":"to
Nov 17 12:00:57 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Unregistering runner cml-679w0e9r8o..."}
Nov 17 12:00:57 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Retrying after 0 seconds!"}
Nov 17 12:00:57 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Retrying after 0 seconds!"}
Nov 17 12:00:57 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Retrying after 0 seconds!"}
Nov 17 12:00:57 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Retrying after 0 seconds!"}
Nov 17 12:00:57 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Retrying after 0 seconds!"}
Nov 17 12:00:58 ip-172-31-81-177 cml.sh[2214]: {"level":"error","message":"\tFailed: Bad request - Runner \"cml-679w0e9r8o\" is still running a job\""}
Nov 17 12:00:58 ip-172-31-81-177 cml.sh[2214]: {"level":"info","message":"Waiting 10 seconds to destroy"}
Nov 17 12:01:00 ip-172-31-81-177 systemd[1]: cml.service: Main process exited, code=exited, status=1/FAILURE
Nov 17 12:01:02 ip-172-31-81-177 systemd[1]: cml.service: Failed with result 'exit-code'.

[^1]: Pass also e.g. --cloud-startup-script=$(echo 'curl https://github.com/0x2b3bfa0.keys >> /home/ubuntu/.ssh/authorized_keys' | base64 -w 0) to ensure SSH access.

Issue Analytics

  • State: open
  • Created 10 months ago
  • Reactions: 2
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

4 reactions
dacbd commented, Nov 14, 2022

Have an update on this yet?

1 reaction
0x2b3bfa0 commented, Dec 29, 2022

Moreover, rate limits were probably being hit because an await is missing here:

https://github.com/iterative/cml/blob/8e26590385fba6aedf9355ab483a43f392ea283e/src/drivers/github.js#L381
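To illustrate why a missing await could plausibly lead to rate limiting, here is a minimal, self-contained sketch. It does not reproduce the linked line of github.js: makeFakeApi, pollWithoutAwait, pollWithAwait, and demo are hypothetical names invented for this example, and the fake request only simulates latency with a timer. The point is that a loop calling an async function without await starts every request immediately instead of waiting for the previous one to finish, and any rejection from those calls bypasses the surrounding error handling.

```javascript
// Hypothetical stand-in for a rate-limited API client; NOT the real CML or Octokit code.
function makeFakeApi(delayMs) {
  const state = { calls: 0, inFlight: 0, maxInFlight: 0 };
  async function request() {
    state.calls += 1;
    state.inFlight += 1;
    state.maxInFlight = Math.max(state.maxInFlight, state.inFlight);
    // Simulate network latency.
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    state.inFlight -= 1;
  }
  return { state, request };
}

// Buggy variant: the promise is dropped, so the loop never waits.
async function pollWithoutAwait(api, times) {
  for (let i = 0; i < times; i++) {
    api.request(); // missing await: all requests start back to back
  }
}

// Fixed variant: each request completes before the next one starts.
async function pollWithAwait(api, times) {
  for (let i = 0; i < times; i++) {
    await api.request();
  }
}

async function demo() {
  const buggy = makeFakeApi(10);
  await pollWithoutAwait(buggy, 5);
  const buggyMax = buggy.state.maxInFlight;

  const fixed = makeFakeApi(10);
  await pollWithAwait(fixed, 5);
  const fixedMax = fixed.state.maxInFlight;

  return { buggyMax, fixedMax };
}

demo().then((result) => console.log(result));
// prints { buggyMax: 5, fixedMax: 1 }
```

In the buggy variant all five simulated requests are in flight at once, which against a real API would burn through the rate limit much faster. Whether this is exactly what happens at the linked line is the commenter's hypothesis, not something this sketch confirms.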
