`cml runner` failure condition
Similar to #906. Rerunning the workflow resulted in a successful run.
cml runner
cmd
cml-runner \
--single \
--labels=cml-prod \
--idle-timeout=3600 \
--token=*** \
--cloud=gcp \
--cloud-region=us-west \
--cloud-type=e2-standard-16
logs
{"level":"error","message":"terraform -chdir='/home/runner/.cml/cml-n3m3zg7li8' apply -auto-approve
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# iterative_cml_runner.runner will be created
+ resource "iterative_cml_runner" "runner" {
+ cloud = "gcp"
+ cml_version = "0.12.0"
+ docker_volumes = []
+ driver = "github"
+ id = (known after apply)
+ idle_timeout = 3600
+ instance_hdd_size = 35
+ instance_ip = (known after apply)
+ instance_launch_time = (known after apply)
+ instance_type = "e2-standard-16"
+ labels = "cml-prod"
+ name = "cml-n3m3zg7li8"
+ region = "us-west"
+ repo = "xxx"
+ single = true
+ spot = false
+ spot_price = -1
+ ssh_public = (known after apply)
+ token = (sensitive value)
}
Plan: 1 to add, 0 to change, 0 to destroy.
iterative_cml_runner.runner: Creating...
iterative_cml_runner.runner: Still creating... [10s elapsed]
....
iterative_cml_runner.runner: Still creating... [10m20s elapsed]
╷
│ Error: Error checking the runner status
│
│ with iterative_cml_runner.runner,
│ on main.tf line 14, in resource "iterative_cml_runner" "runner":
│ 14: resource "iterative_cml_runner" "runner" {
│
│ -- Logs begin at Sat 2022-03-26 14:07:39 UTC, end at Sat 2022-03-26
│ 14:17:22 UTC. --
│ -- No entries --
│
╵
","stack":"Error: terraform -chdir='/home/runner/.cml/cml-n3m3zg7li8' apply -auto-approve
Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# iterative_cml_runner.runner will be created
+ resource "iterative_cml_runner" "runner" {
+ cloud = "gcp"
+ cml_version = "0.12.0"
+ docker_volumes = []
+ driver = "github"
+ id = (known after apply)
+ idle_timeout = 3600
+ instance_hdd_size = 35
+ instance_ip = (known after apply)
+ instance_launch_time = (known after apply)
+ instance_type = "e2-standard-16"
+ labels = "cml-prod"
+ name = "cml-n3m3zg7li8"
+ region = "us-west"
+ repo = "xxx"
+ single = true
+ spot = false
+ spot_price = -1
+ ssh_public = (known after apply)
+ token = (sensitive value)
}
Plan: 1 to add, 0 to change, 0 to destroy.
iterative_cml_runner.runner: Creating...
iterative_cml_runner.runner: Still creating... [10s elapsed]
...
iterative_cml_runner.runner: Still creating... [10m20s elapsed]
╷
│ Error: Error checking the runner status
│
│ with iterative_cml_runner.runner,
│ on main.tf line 14, in resource "iterative_cml_runner" "runner":
│ 14: resource "iterative_cml_runner" "runner" {
│
│ -- Logs begin at Sat 2022-03-26 14:07:39 UTC, end at Sat 2022-03-26
│ 14:17:22 UTC. --
│ -- No entries --
│
╵
at /usr/local/lib/node_modules/@dvcorg/cml/src/utils.js:20:27
at ChildProcess.exithandler (node:child_process:406:5)
at ChildProcess.emit (node:events:520:28)
at maybeClose (node:internal/child_process:1092:16)
at Process.ChildProcess._handle.onexit (node:internal/child_process:302:5)","status":"terminated"}
instance cml.service
log
-- Logs begin at Sat 2022-03-26 14:07:39 UTC. --
Mar 26 14:20:19 cml-n3m3zg7li8 systemd[1]: Started cml.service.
Mar 26 14:30:00 cml-n3m3zg7li8 cml.sh[19171]: {"level":"error","message":"terraform -chdir='/tmp/tmp.33duRrCLVW/.cml/cml-n3m3zg7li8' init
Initializing the backend...
Initializing provider plugins...
- Finding latest version of iterative/iterative...
- Installing iterative/iterative v0.10.2...
╷
│ Error: Failed to install provider
│
│ Error while installing iterative/iterative v0.10.2: read tcp
│ 10.138.0.27:44836->185.199.111.133:443: read: connection reset by peer
╵
","stack":"Error: terraform -chdir='/tmp/tmp.33duRrCLVW/.cml/cml-n3m3zg7li8' init
Initializing the backend...
Initializing provider plugins...
- Finding latest version of iterative/iterative...
- Installing iterative/iterative v0.10.2...
╷
│ Error: Failed to install provider
│
│ Error while installing iterative/iterative v0.10.2: read tcp
│ 10.138.0.27:44836->185.199.111.133:443: read: connection reset by peer
╵
at /snapshot/cml/src/utils.js:20:27
at ChildProcess.exithandler (node:child_process:404:5)
at ChildProcess.emit (node:events:390:28)
at maybeClose (node:internal/child_process:1064:16)
at Process.ChildProcess._handle.onexit (node:internal/child_process:301:5)","status":"terminated"}
Mar 26 14:30:00 cml-n3m3zg7li8 cml.sh[19171]: {"level":"info","message":"waiting 10 seconds before exiting..."}
Mar 26 14:30:10 cml-n3m3zg7li8 cml.sh[19171]: {"level":"error","message":" Failed destroying terraform: terraform -chdir='/tmp/tmp.33duRrCLVW/.cml/cml-n3m3zg7li8' destroy -auto-approve
╷
│ Error: Inconsistent dependency lock file
│
│ The following dependency selections recorded in the lock file are
│ inconsistent with the current configuration:
│ - provider registry.terraform.io/iterative/iterative: required by this configuration but no version is selected
│
│ To make the initial dependency selections that will initialize the
│ dependency lock file, run:
│ terraform init
╵
"}
Mar 26 14:30:10 cml-n3m3zg7li8 systemd[1]: cml.service: Main process exited, code=exited, status=1/FAILURE
Mar 26 14:30:10 cml-n3m3zg7li8 systemd[1]: cml.service: Failed with result 'exit-code'.
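The instance-side log above shows `terraform init` failing on a connection reset while downloading the provider, and the report notes that a rerun succeeded. One way to absorb that kind of transient failure is to retry the command a few times. The following is only a sketch under that assumption: `retry` is a hypothetical helper, not part of CML or TPI, and the attempt count and delay are arbitrary.

```shell
# Hypothetical helper (not part of CML): retry a command up to
# <attempts> times, sleeping <delay> seconds between failed attempts.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    echo "attempt $n failed, retrying in ${delay}s: $*" >&2
    sleep "$delay"
    n=$((n + 1))
  done
}

# Example usage (assumed values, not run here):
#   retry 3 10 terraform -chdir="$workdir" init
```

A retry like this only helps with transient errors such as the connection reset in the log; it would not fix a persistent configuration problem.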
Top GitHub Comments
Upcoming release in TPI will fix this
Indeed, it would be nice for the invoking side to do a destroy if it failed, so it doesn't litter instances.
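The destroy-on-failure idea from the comment above could look roughly like the sketch below. It is an illustration, not how CML or TPI implement it: `apply_or_destroy` is a hypothetical wrapper, and it uses only the `terraform -chdir=... apply -auto-approve` and `destroy -auto-approve` invocations that already appear in the logs.

```shell
# Hypothetical wrapper (not CML's implementation): run `terraform apply`
# and, if it fails, attempt `terraform destroy` so that a partially
# created instance is not left running.
apply_or_destroy() {
  local workdir="$1" status=0
  terraform -chdir="$workdir" apply -auto-approve || status=$?
  if [ "$status" -ne 0 ]; then
    echo "apply failed (exit $status); destroying leftover resources" >&2
    terraform -chdir="$workdir" destroy -auto-approve ||
      echo "destroy also failed; manual cleanup may be needed" >&2
  fi
  # Propagate the original apply status so the caller still sees the failure.
  return "$status"
}

# Example usage (path taken from the log above, not run here):
#   apply_or_destroy /home/runner/.cml/cml-n3m3zg7li8
```

The wrapper returns the exit status of `apply` rather than of `destroy`, so the workflow still fails visibly even when the cleanup succeeds.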