`cml runner` failure edge case


Documenting an edge case that happened during a daily pulse check: the `cml runner` command failed but still created an instance that was left running. Let's call it “unattended” or “orphaned”?
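One way to spot such leftovers is to filter a machine listing for CML-named instances that are still running. The following is a minimal sketch, assuming the listing is JSON shaped like the output of `gcloud compute instances list --format=json` (objects with `name` and `status` fields) and that runner names keep the `cml-` prefix seen in the log below. `find_orphans` is a hypothetical helper, not part of CML.

```python
import json


def find_orphans(listing_json, prefix="cml-"):
    """Return names of running instances whose name starts with `prefix`."""
    instances = json.loads(listing_json)
    return sorted(
        inst["name"]
        for inst in instances
        if inst.get("name", "").startswith(prefix)
        and inst.get("status") == "RUNNING"
    )


# Illustrative data only; the field names are an assumption about the listing format.
sample = json.dumps([
    {"name": "cml-oqbyyfr3qf", "status": "RUNNING"},
    {"name": "cml-done", "status": "TERMINATED"},
    {"name": "web-1", "status": "RUNNING"},
])
print(find_orphans(sample))
```

Anything this returns outside an active workflow run is a candidate for manual cleanup.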

ran with

cml-runner \
    --single \
    --log=debug \
    --idle-timeout=1800 \
    --token=*** \
    --cloud=gcp \
    --cloud-region=us-west \
    --cloud-type=e2-standard-2 \
    --cloud-hdd-size=10

de-potatoed log

{"level":"warn","message":"ignoring RUNNER_NAME environment variable, use CML_RUNNER_NAME or --name instead"}
{"level":"info","message":"Preparing workdir /home/runner/.cml/cml-oqbyyfr3qf..."}
{"level":"info","message":"Deploying cloud runner plan..."}
{"level":"info","message":"Terraform apply..."}
{"level":"error","message":"terraform -chdir='/home/runner/.cml/cml-oqbyyfr3qf' apply -auto-approve
	
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # iterative_cml_runner.runner will be created
  + resource "iterative_cml_runner" "runner" {
      + cloud                = "gcp"
      + docker_volumes       = []
      + driver               = "github"
      + id                   = (known after apply)
      + idle_timeout         = 1800
      + instance_hdd_size    = 10
      + instance_ip          = (known after apply)
      + instance_launch_time = (known after apply)
      + instance_type        = "e2-standard-2"
      + labels               = "cml"
      + name                 = "cml-oqbyyfr3qf"
      + region               = "us-west"
      + repo                 = "xxxx"
      + single               = true
      + spot                 = false
      + spot_price           = -1
      + ssh_public           = (known after apply)
      + token                = (sensitive value)
    }

Plan: 1 to add, 0 to change, 0 to destroy.
iterative_cml_runner.runner: Creating...
iterative_cml_runner.runner: Still creating... [10s elapsed]
iterative_cml_runner.runner: Still creating... [20s elapsed]
iterative_cml_runner.runner: Still creating... [30s elapsed]
iterative_cml_runner.runner: Still creating... [40s elapsed]
iterative_cml_runner.runner: Still creating... [50s elapsed]
iterative_cml_runner.runner: Still creating... [1m0s elapsed]
iterative_cml_runner.runner: Still creating... [1m10s elapsed]
iterative_cml_runner.runner: Still creating... [1m20s elapsed]
iterative_cml_runner.runner: Still creating... [1m30s elapsed]
iterative_cml_runner.runner: Still creating... [1m40s elapsed]
iterative_cml_runner.runner: Still creating... [1m50s elapsed]
iterative_cml_runner.runner: Still creating... [2m0s elapsed]
iterative_cml_runner.runner: Still creating... [2m10s elapsed]
iterative_cml_runner.runner: Still creating... [2m20s elapsed]
iterative_cml_runner.runner: Still creating... [2m30s elapsed]
iterative_cml_runner.runner: Still creating... [2m40s elapsed]
iterative_cml_runner.runner: Still creating... [2m50s elapsed]
iterative_cml_runner.runner: Still creating... [3m0s elapsed]
iterative_cml_runner.runner: Still creating... [3m10s elapsed]
iterative_cml_runner.runner: Still creating... [3m20s elapsed]
iterative_cml_runner.runner: Still creating... [3m30s elapsed]

	╷
│ Error: Error checking the runner status
│
│   with iterative_cml_runner.runner,
│   on main.tf line 14, in resource "iterative_cml_runner" "runner":
│   14: resource "iterative_cml_runner" "runner" {
│
│ -- Logs begin at Mon 2022-03-07 16:07:02 UTC, end at Mon 2022-03-07
│ 16:10:01 UTC. --
│ Mar 07 16:09:58 cml-oqbyyfr3qf systemd[1]: Started cml.service.
│ Mar 07 16:10:00 cml-oqbyyfr3qf cml.sh[16091]:
│ {"level":"error","message":"terraform
│ -chdir='/tmp/tmp.sJzxlHtykd/.cml/cml-oqbyyfr3qf' init\n\t\nInitializing the
│ backend...\n\nInitializing provider plugins...\n- Finding latest version of
│ iterative/iterative...\n\n\t╷\n│ Error: Failed to install provider\n│ \n│
│ Error while installing iterative/iterative v0.9.14: could not query\n│
│ provider registry for registry.terraform.io/iterative/iterative: failed
│ to\n│ retrieve authentication checksums for provider: 403
│ Forbidden\n╵\n\n","stack":"Error: terraform
│ -chdir='/tmp/tmp.sJzxlHtykd/.cml/cml-oqbyyfr3qf' init\n\t\nInitializing the
│ backend...\n\nInitializing provider plugins...\n- Finding latest version of
│ iterative/iterative...\n\n\t╷\n│ Error: Failed to install provider\n│ \n│
│ Error while installing iterative/iterative v0.9.14: could not query\n│
│ provider registry for registry.terraform.io/iterative/iterative: failed
│ to\n│ retrieve authentication checksums for provider: 403
│ Forbidden\n╵\n\n\n    at
│ /usr/lib/node_modules/@dvcorg/cml/src/utils.js:20:27\n    at
│ ChildProcess.exithandler (child_process.js:315:5)\n    at ChildProcess.emit
│ (events.js:314:20)\n    at maybeClose (internal/child_process.js:1022:16)\n
│ at Process.ChildProcess._handle.onexit
│ (internal/child_process.js:287:5)","status":"terminated"}
│ Mar 07 16:10:00 cml-oqbyyfr3qf cml.sh[16091]:
│ {"level":"info","message":"waiting 20 seconds before exiting..."}
│
╵
","stack":"Error: terraform -chdir='/home/runner/.cml/cml-oqbyyfr3qf' apply -auto-approve
	
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # iterative_cml_runner.runner will be created
  + resource "iterative_cml_runner" "runner" {
      + cloud                = "gcp"
      + docker_volumes       = []
      + driver               = "github"
      + id                   = (known after apply)
      + idle_timeout         = 1800
      + instance_hdd_size    = 10
      + instance_ip          = (known after apply)
      + instance_launch_time = (known after apply)
      + instance_type        = "e2-standard-2"
      + labels               = "cml"
      + name                 = "cml-oqbyyfr3qf"
      + region               = "us-west"
      + repo                 = "xxx"
      + single               = true
      + spot                 = false
      + spot_price           = -1
      + ssh_public           = (known after apply)
      + token                = (sensitive value)
    }

Plan: 1 to add, 0 to change, 0 to destroy.
iterative_cml_runner.runner: Creating...
iterative_cml_runner.runner: Still creating... [10s elapsed]
iterative_cml_runner.runner: Still creating... [20s elapsed]
iterative_cml_runner.runner: Still creating... [30s elapsed]
iterative_cml_runner.runner: Still creating... [40s elapsed]
iterative_cml_runner.runner: Still creating... [50s elapsed]
iterative_cml_runner.runner: Still creating... [1m0s elapsed]
iterative_cml_runner.runner: Still creating... [1m10s elapsed]
iterative_cml_runner.runner: Still creating... [1m20s elapsed]
iterative_cml_runner.runner: Still creating... [1m30s elapsed]
iterative_cml_runner.runner: Still creating... [1m40s elapsed]
iterative_cml_runner.runner: Still creating... [1m50s elapsed]
iterative_cml_runner.runner: Still creating... [2m0s elapsed]
iterative_cml_runner.runner: Still creating... [2m10s elapsed]
iterative_cml_runner.runner: Still creating... [2m20s elapsed]
iterative_cml_runner.runner: Still creating... [2m30s elapsed]
iterative_cml_runner.runner: Still creating... [2m40s elapsed]
iterative_cml_runner.runner: Still creating... [2m50s elapsed]
iterative_cml_runner.runner: Still creating... [3m0s elapsed]
iterative_cml_runner.runner: Still creating... [3m10s elapsed]
iterative_cml_runner.runner: Still creating... [3m20s elapsed]
iterative_cml_runner.runner: Still creating... [3m30s elapsed]

	╷
│ Error: Error checking the runner status
│
│   with iterative_cml_runner.runner,
│   on main.tf line 14, in resource "iterative_cml_runner" "runner":
│   14: resource "iterative_cml_runner" "runner" {
│
│ -- Logs begin at Mon 2022-03-07 16:07:02 UTC, end at Mon 2022-03-07
│ 16:10:01 UTC. --
│ Mar 07 16:09:58 cml-oqbyyfr3qf systemd[1]: Started cml.service.
│ Mar 07 16:10:00 cml-oqbyyfr3qf cml.sh[16091]:
│ {"level":"error","message":"terraform
│ -chdir='/tmp/tmp.sJzxlHtykd/.cml/cml-oqbyyfr3qf' init\n\t\nInitializing the
│ backend...\n\nInitializing provider plugins...\n- Finding latest version of
│ iterative/iterative...\n\n\t╷\n│ Error: Failed to install provider\n│ \n│
│ Error while installing iterative/iterative v0.9.14: could not query\n│
│ provider registry for registry.terraform.io/iterative/iterative: failed
│ to\n│ retrieve authentication checksums for provider: 403
│ Forbidden\n╵\n\n","stack":"Error: terraform
│ -chdir='/tmp/tmp.sJzxlHtykd/.cml/cml-oqbyyfr3qf' init\n\t\nInitializing the
│ backend...\n\nInitializing provider plugins...\n- Finding latest version of
│ iterative/iterative...\n\n\t╷\n│ Error: Failed to install provider\n│ \n│
│ Error while installing iterative/iterative v0.9.14: could not query\n│
│ provider registry for registry.terraform.io/iterative/iterative: failed
│ to\n│ retrieve authentication checksums for provider: 403
│ Forbidden\n╵\n\n\n    at
│ /usr/lib/node_modules/@dvcorg/cml/src/utils.js:20:27\n    at
│ ChildProcess.exithandler (child_process.js:315:5)\n    at ChildProcess.emit
│ (events.js:314:20)\n    at maybeClose (internal/child_process.js:1022:16)\n
│ at Process.ChildProcess._handle.onexit
│ (internal/child_process.js:287:5)","status":"terminated"}
│ Mar 07 16:10:00 cml-oqbyyfr3qf cml.sh[16091]:
│ {"level":"info","message":"waiting 20 seconds before exiting..."}
│
╵

    at /usr/local/lib/node_modules/@dvcorg/cml/src/utils.js:20:27
    at ChildProcess.exithandler (node:child_process:406:5)
    at ChildProcess.emit (node:events:520:28)
    at maybeClose (node:internal/child_process:1092:16)
    at Process.ChildProcess._handle.onexit (node:internal/child_process:302:5)","status":"terminated" }
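
The real cause is buried several layers deep: the `terraform init` that ran on the new instance got a `403 Forbidden` from the provider registry. The following is a minimal sketch for pulling that out of a captured log, assuming each record is available as one JSON object per line with `level` and `message` fields (the multi-line dump above would need to be rejoined first). `root_causes` and the pattern list are hypothetical, not part of CML.

```python
import json
import re

# Failure markers taken from the log above; extend as needed.
CAUSE_PATTERNS = [
    re.compile(r"Failed to install provider"),
    re.compile(r"\b\d{3} Forbidden\b"),
]


def root_causes(log_lines):
    """Collect known failure markers from error-level JSON log records."""
    found = []
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not standalone JSON records
        if not isinstance(record, dict) or record.get("level") != "error":
            continue
        message = record.get("message", "")
        for pattern in CAUSE_PATTERNS:
            match = pattern.search(message)
            if match and match.group(0) not in found:
                found.append(match.group(0))
    return found


sample_log = [
    '{"level":"info","message":"Terraform apply..."}',
    '{"level":"error","message":"Error: Failed to install provider\\n'
    'failed to retrieve authentication checksums for provider: 403 Forbidden"}',
]
print(root_causes(sample_log))
```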

Issue Analytics

  • State: open
  • Created 2 years ago
  • Reactions: 1
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

2 reactions
DavidGOrtega commented, Mar 7, 2022

@dacbd I have tried a cloud runner and I can confirm that it works. It might be a transient registry error.
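
If the failure really is transient, retrying the failing step with a backoff is one common mitigation. The following is a minimal sketch of that idea only; nothing in this thread confirms that CML does this. `retry`, `TransientError` and `flaky_init` are hypothetical names used for illustration.

```python
import time


class TransientError(Exception):
    """Stand-in for a failure worth retrying, e.g. a registry 403."""


def retry(action, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `action` until it succeeds, doubling the delay after each failure."""
    for attempt in range(attempts):
        try:
            return action()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


calls = {"count": 0}


def flaky_init():
    """Fails twice, then succeeds, to imitate a transient registry error."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("403 Forbidden")
    return "initialized"


delays = []
result = retry(flaky_init, sleep=delays.append)  # record delays instead of sleeping
print(result, delays)
```

Injecting `sleep` keeps the example fast to run and makes the backoff schedule easy to inspect.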

1 reaction
dacbd commented, Jun 13, 2022

If it does reoccur, I think https://github.com/iterative/cml/pull/1052 will be helpful.
