`cml runner` failure edge case
Documenting an edge case that happened during a daily pulse check: the cml runner
command failed but created an instance that was left on. Let's call it "unattended" or "orphaned"?
ran with
cml-runner \
--single \
--log=debug \
--idle-timeout=1800 \
--token=*** \
--cloud=gcp \
--cloud-region=us-west \
--cloud-type=e2-standard-2 \
--cloud-hdd-size=10
de-potatoed log
{"level":"warn","message":"ignoring RUNNER_NAME environment variable, use CML_RUNNER_NAME or --name instead"}
{"level":"info","message":"Preparing workdir /home/runner/.cml/cml-oqbyyfr3qf..."}
{"level":"info","message":"Deploying cloud runner plan..."}
{"level":"info","message":"Terraform apply..."}
{"level":"error","message":"terraform -chdir='/home/runner/.cml/cml-oqbyyfr3qf' apply -auto-approve
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# iterative_cml_runner.runner will be created
+ resource "iterative_cml_runner" "runner" {
+ cloud = "gcp"
+ docker_volumes = []
+ driver = "github"
+ id = (known after apply)
+ idle_timeout = 1800
+ instance_hdd_size = 10
+ instance_ip = (known after apply)
+ instance_launch_time = (known after apply)
+ instance_type = "e2-standard-2"
+ labels = "cml"
+ name = "cml-oqbyyfr3qf"
+ region = "us-west"
+ repo = "xxxx"
+ single = true
+ spot = false
+ spot_price = -1
+ ssh_public = (known after apply)
+ token = (sensitive value)
***
Plan: 1 to add, 0 to change, 0 to destroy.
iterative_cml_runner.runner: Creating...
iterative_cml_runner.runner: Still creating... [10s elapsed]
iterative_cml_runner.runner: Still creating... [20s elapsed]
iterative_cml_runner.runner: Still creating... [30s elapsed]
iterative_cml_runner.runner: Still creating... [40s elapsed]
iterative_cml_runner.runner: Still creating... [50s elapsed]
iterative_cml_runner.runner: Still creating... [1m0s elapsed]
iterative_cml_runner.runner: Still creating... [1m10s elapsed]
iterative_cml_runner.runner: Still creating... [1m20s elapsed]
iterative_cml_runner.runner: Still creating... [1m30s elapsed]
iterative_cml_runner.runner: Still creating... [1m40s elapsed]
iterative_cml_runner.runner: Still creating... [1m50s elapsed]
iterative_cml_runner.runner: Still creating... [2m0s elapsed]
iterative_cml_runner.runner: Still creating... [2m10s elapsed]
iterative_cml_runner.runner: Still creating... [2m20s elapsed]
iterative_cml_runner.runner: Still creating... [2m30s elapsed]
iterative_cml_runner.runner: Still creating... [2m40s elapsed]
iterative_cml_runner.runner: Still creating... [2m50s elapsed]
iterative_cml_runner.runner: Still creating... [3m0s elapsed]
iterative_cml_runner.runner: Still creating... [3m10s elapsed]
iterative_cml_runner.runner: Still creating... [3m20s elapsed]
iterative_cml_runner.runner: Still creating... [3m30s elapsed]
╷
│ Error: Error checking the runner status
│
│ with iterative_cml_runner.runner,
│ on main.tf line 14, in resource "iterative_cml_runner" "runner":
│ 14: resource "iterative_cml_runner" "runner" {
│
│ -- Logs begin at Mon 2022-03-07 16:07:02 UTC, end at Mon 2022-03-07
│ 16:10:01 UTC. --
│ Mar 07 16:09:58 cml-oqbyyfr3qf systemd[1]: Started cml.service.
│ Mar 07 16:10:00 cml-oqbyyfr3qf cml.sh[16091]:
│ {"level":"error","message":"terraform
│ -chdir='/tmp/tmp.sJzxlHtykd/.cml/cml-oqbyyfr3qf' init\n\t\nInitializing the
│ backend...\n\nInitializing provider plugins...\n- Finding latest version of
│ iterative/iterative...\n\n\t╷\n│ Error: Failed to install provider\n│ \n│
│ Error while installing iterative/iterative v0.9.14: could not query\n│
│ provider registry for registry.terraform.io/iterative/iterative: failed
│ to\n│ retrieve authentication checksums for provider: 403
│ Forbidden\n╵\n\n","stack":"Error: terraform
│ -chdir='/tmp/tmp.sJzxlHtykd/.cml/cml-oqbyyfr3qf' init\n\t\nInitializing the
│ backend...\n\nInitializing provider plugins...\n- Finding latest version of
│ iterative/iterative...\n\n\t╷\n│ Error: Failed to install provider\n│ \n│
│ Error while installing iterative/iterative v0.9.14: could not query\n│
│ provider registry for registry.terraform.io/iterative/iterative: failed
│ to\n│ retrieve authentication checksums for provider: 403
│ Forbidden\n╵\n\n\n at
│ /usr/lib/node_modules/@dvcorg/cml/src/utils.js:20:27\n at
│ ChildProcess.exithandler (child_process.js:315:5)\n at ChildProcess.emit
│ (events.js:314:20)\n at maybeClose (internal/child_process.js:1022:16)\n
│ at Process.ChildProcess._handle.onexit
│ (internal/child_process.js:287:5)","status":"terminated"}
│ Mar 07 16:10:00 cml-oqbyyfr3qf cml.sh[16091]:
│ {"level":"info","message":"waiting 20 seconds before exiting..."}
│
╵
","stack":"Error: terraform -chdir='/home/runner/.cml/cml-oqbyyfr3qf' apply -auto-approve
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# iterative_cml_runner.runner will be created
+ resource "iterative_cml_runner" "runner" {
+ cloud = "gcp"
+ docker_volumes = []
+ driver = "github"
+ id = (known after apply)
+ idle_timeout = 1800
+ instance_hdd_size = 10
+ instance_ip = (known after apply)
+ instance_launch_time = (known after apply)
+ instance_type = "e2-standard-2"
+ labels = "cml"
+ name = "cml-oqbyyfr3qf"
+ region = "us-west"
+ repo = "xxx"
+ single = true
+ spot = false
+ spot_price = -1
+ ssh_public = (known after apply)
+ token = (sensitive value)
}
Plan: 1 to add, 0 to change, 0 to destroy.
iterative_cml_runner.runner: Creating...
iterative_cml_runner.runner: Still creating... [10s elapsed]
iterative_cml_runner.runner: Still creating... [20s elapsed]
iterative_cml_runner.runner: Still creating... [30s elapsed]
iterative_cml_runner.runner: Still creating... [40s elapsed]
iterative_cml_runner.runner: Still creating... [50s elapsed]
iterative_cml_runner.runner: Still creating... [1m0s elapsed]
iterative_cml_runner.runner: Still creating... [1m10s elapsed]
iterative_cml_runner.runner: Still creating... [1m20s elapsed]
iterative_cml_runner.runner: Still creating... [1m30s elapsed]
iterative_cml_runner.runner: Still creating... [1m40s elapsed]
iterative_cml_runner.runner: Still creating... [1m50s elapsed]
iterative_cml_runner.runner: Still creating... [2m0s elapsed]
iterative_cml_runner.runner: Still creating... [2m10s elapsed]
iterative_cml_runner.runner: Still creating... [2m20s elapsed]
iterative_cml_runner.runner: Still creating... [2m30s elapsed]
iterative_cml_runner.runner: Still creating... [2m40s elapsed]
iterative_cml_runner.runner: Still creating... [2m50s elapsed]
iterative_cml_runner.runner: Still creating... [3m0s elapsed]
iterative_cml_runner.runner: Still creating... [3m10s elapsed]
iterative_cml_runner.runner: Still creating... [3m20s elapsed]
iterative_cml_runner.runner: Still creating... [3m30s elapsed]
╷
│ Error: Error checking the runner status
│
│ with iterative_cml_runner.runner,
│ on main.tf line 14, in resource "iterative_cml_runner" "runner":
│ 14: resource "iterative_cml_runner" "runner" {
│
│ -- Logs begin at Mon 2022-03-07 16:07:02 UTC, end at Mon 2022-03-07
│ 16:10:01 UTC. --
│ Mar 07 16:09:58 cml-oqbyyfr3qf systemd[1]: Started cml.service.
│ Mar 07 16:10:00 cml-oqbyyfr3qf cml.sh[16091]:
│ ***"level":"error","message":"terraform
│ -chdir='/tmp/tmp.sJzxlHtykd/.cml/cml-oqbyyfr3qf' init\n\t\nInitializing the
│ backend...\n\nInitializing provider plugins...\n- Finding latest version of
│ iterative/iterative...\n\n\t╷\n│ Error: Failed to install provider\n│ \n│
│ Error while installing iterative/iterative v0.9.14: could not query\n│
│ provider registry for registry.terraform.io/iterative/iterative: failed
│ to\n│ retrieve authentication checksums for provider: 403
│ Forbidden\n╵\n\n","stack":"Error: terraform
│ -chdir='/tmp/tmp.sJzxlHtykd/.cml/cml-oqbyyfr3qf' init\n\t\nInitializing the
│ backend...\n\nInitializing provider plugins...\n- Finding latest version of
│ iterative/iterative...\n\n\t╷\n│ Error: Failed to install provider\n│ \n│
│ Error while installing iterative/iterative v0.9.14: could not query\n│
│ provider registry for registry.terraform.io/iterative/iterative: failed
│ to\n│ retrieve authentication checksums for provider: 403
│ Forbidden\n╵\n\n\n at
│ /usr/lib/node_modules/@dvcorg/cml/src/utils.js:20:27\n at
│ ChildProcess.exithandler (child_process.js:315:5)\n at ChildProcess.emit
│ (events.js:314:20)\n at maybeClose (internal/child_process.js:1022:16)\n
│ at Process.ChildProcess._handle.onexit
│ (internal/child_process.js:287:5)","status":"terminated"***
│ Mar 07 16:10:00 cml-oqbyyfr3qf cml.sh[16091]:
│ ***"level":"info","message":"waiting 20 seconds before exiting..."***
│
╵
at /usr/local/lib/node_modules/@dvcorg/cml/src/utils.js:20:27
at ChildProcess.exithandler (node:child_process:406:5)
at ChildProcess.emit (node:events:520:28)
at maybeClose (node:internal/child_process:1092:16)
at Process.ChildProcess._handle.onexit (node:internal/child_process:302:5)","status":"terminated" }
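For anyone triaging a similar run, the relevant signal is buried in the nested output above: the instance-side `terraform init` failed with "Failed to install provider" and "403 Forbidden" from the registry. A minimal Python sketch for spotting that pattern in a saved log follows. The helper names `normalize` and `looks_like_registry_403` are hypothetical and not part of CML; the sketch only assumes the log text looks like the excerpt in this issue (literal `\n` escapes, box-drawing frame characters, and wrapped lines).

```python
import re


def normalize(log_text: str) -> str:
    """Flatten wrapped, framed Terraform output into one line of plain text."""
    # The nested JSON carries literal backslash-n / backslash-t sequences.
    text = log_text.replace("\\n", " ").replace("\\t", " ")
    # Drop non-ASCII frame characters (box drawing, or their mis-decoded form).
    text = re.sub(r"[^\x00-\x7f]", " ", text)
    # Collapse real newlines and runs of spaces so wrapped phrases rejoin.
    return re.sub(r"\s+", " ", text).strip()


def looks_like_registry_403(log_text: str) -> bool:
    """Hypothetical check: True when both phrases from this issue's log appear."""
    flat = normalize(log_text)
    return "Failed to install provider" in flat and "403 Forbidden" in flat
```

Usage would be something like `looks_like_registry_403(open("runner.log").read())`, where `runner.log` is an assumed file name for the captured output. A match only suggests the same symptom; it does not say whether the instance was cleaned up.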
Top GitHub Comments
@dacbd I have tried a cloud runner and I can confirm that it works. It might be a transient registry error.
If it does recur, I think https://github.com/iterative/cml/pull/1052 will be helpful.
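If the registry error really is transient, one generic mitigation is to retry the failing step with backoff. The Python sketch below is only an illustration of that idea: it is not what the linked PR does, and `retry_with_backoff` and its parameters are hypothetical names. It is best suited to an idempotent step such as the provider install; rerunning the whole provisioning command could leave additional instances behind.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    step: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `step` until it succeeds or `attempts` calls have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return step()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Exponential backoff: base_delay, then 2x, then 4x, ...
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the delay schedule can be checked without actually waiting.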