`cml-runner` times out with ssh handshake failure


Hey everyone, I’m new to CML and I’m trying to use the cml-runner in order to schedule some optimisations.

I can see the EC2 instances get deployed properly, but the cml-runner command gets stuck at the `Terraform apply...` stage. The EC2 instances stay up until the timeout and then shut themselves down. Meanwhile, cml-runner is still stuck waiting for Terraform to finish applying, so I’m forced to cancel the workflow.

Looking into this a little further, I can see that terraform apply itself times out with a handshake failure. If I look at the details of the deployed EC2 instance, I can see that it doesn’t have a public IPv4 address and that it’s in a private subnet (not part of our default VPC). Isn’t this going to prevent the GitHub Actions runner from handshaking with it? Any ideas on what could be causing this behaviour?
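The two symptoms described above (no public IPv4 address, private subnet) can be checked mechanically. The following is a minimal sketch, not part of CML: it assumes you already have the instance and subnet descriptions as dictionaries shaped like the EC2 `DescribeInstances` and `DescribeSubnets` responses, which use the `PublicIpAddress` and `MapPublicIpOnLaunch` fields. The function name `reachability_problems` and the sample IDs are hypothetical.

```python
from __future__ import annotations

from typing import Any


def reachability_problems(instance: dict[str, Any], subnet: dict[str, Any]) -> list[str]:
    """List reasons why a client outside the VPC probably cannot SSH to the instance.

    `instance` and `subnet` are assumed to be single entries taken from EC2
    DescribeInstances / DescribeSubnets responses (fetching them is out of scope).
    """
    problems = []
    # EC2 omits PublicIpAddress when the instance has no public IPv4 address.
    if not instance.get("PublicIpAddress"):
        problems.append("instance has no public IPv4 address")
    # Subnets with MapPublicIpOnLaunch=False do not auto-assign public addresses.
    if not subnet.get("MapPublicIpOnLaunch", False):
        problems.append("subnet does not auto-assign public IPv4 addresses")
    return problems


# Hypothetical sample data resembling the situation in this issue.
private_instance = {
    "InstanceId": "i-0example",
    "SubnetId": "subnet-0example",
    "PrivateIpAddress": "10.0.1.23",
}
private_subnet = {"SubnetId": "subnet-0example", "MapPublicIpOnLaunch": False}

for problem in reachability_problems(private_instance, private_subnet):
    print(problem)
```

An empty list does not prove the instance is reachable (security groups and route tables also matter); the check only flags the two conditions mentioned in this post.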

Here is the error trace:

Preparing workdir /home/runner/.cml/cml-quzgh0u7bx...
Deploying cloud runner plan...
Terraform apply...
{"level":"error","status":"terminated"}
Error: terraform -chdir='/home/runner/.cml/cml-quzgh0u7bx' apply -auto-approve
	iterative_cml_runner.runner: Creating...
iterative_cml_runner.runner: Still creating... [10s elapsed]
iterative_cml_runner.runner: Still creating... [20s elapsed]
iterative_cml_runner.runner: Still creating... [30s elapsed]
iterative_cml_runner.runner: Still creating... [40s elapsed]
iterative_cml_runner.runner: Still creating... [10m20s elapsed]
│ Error: Error checking the runner status
│ 
│   on main.tf line 14, in resource "iterative_cml_runner" "runner":
│   14: resource "iterative_cml_runner" "runner" {
│ 
│ 
│ ssh: handshake failed: ssh: unable to authenticate, attempted methods [none
│ publickey], no supported methods remain
╵
    at /usr/local/lib/node_modules/@dvcorg/cml/src/utils.js:15:27
    at ChildProcess.exithandler (child_process.js:315:5)
    at ChildProcess.emit (events.js:315:20)
    at maybeClose (internal/child_process.js:1048:16)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:288:5)
iterative_cml_runner.runner: Refreshing state... [id=iterative-2gjlae369da1t]
iterative_cml_runner.runner: Destroying... [id=iterative-2gjlae369da1t]
iterative_cml_runner.runner: Destruction complete after 1s
Destroy complete! Resources: 1 destroyed.
Error: Process completed with exit code 1.

The GitHub token I’m providing to the script has the following permissions attached:

[image: screenshot of the token permissions, not preserved]

The AWS credentials have the permissions outlined in #429

The GitHub Actions workflow file that this is based on is:

name: Run-Engine-Tests

on:
  push:
    branches: [spike/poddie_cicd]

jobs:

  deploy_runners:
    name: Deploy EC2 Instances

    # strategy:
    #   matrix:
    #     batch_id: [0]
    #     case_name: ['test']
        # batch_id: [0, 1, 2]
        # case_name: ['ONSHORE_PIPELINE', 'OFFSHORE_PIPELINE']

    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v2

      - name: Setup CML
        uses: iterative/setup-cml@v1

      - name: "Deploy runner on EC2"
        shell: bash
        env:
          repo_token: ${{ secrets.ACCESS_TOKEN_CML }}
          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID_temp }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY_temp }}

        run: |
          cml-runner \
          --cloud aws \
          --cloud-region eu-west-2 \
          --cloud-type=t2.micro \

Thanks a lot in advance!


Top GitHub Comments

0x2b3bfa0 commented, Apr 30, 2021

Glad it worked, @thatGreekGuy96! I’m closing this issue in favor of https://github.com/iterative/terraform-provider-iterative/issues/107.

I’m inclined to think that creating an ad-hoc VPC could be better for the user experience, but that’s open for discussion anyway.

thatGreekGuy96 commented, Apr 30, 2021

@0x2b3bfa0 OK, well, thanks for your help in any case 😄 Turns out we didn’t need that other VPC; it was just a leftover from our old infrastructure, so we’ve gone ahead and deleted it, and now everything works great!

That being said, you might want to change the behaviour here or make it a little clearer in the docs. Creating a VPC with all the right settings, or allowing the user to set the VPC themselves, might do the trick?
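To illustrate the suggestion of letting the user choose the VPC, here is a rough sketch of one possible selection rule: use an explicitly requested VPC if one is given, otherwise fall back to the account's default VPC. This is not how CML or the Terraform provider behaves; `pick_vpc` and `preferred_vpc_id` are hypothetical names, and the input is assumed to be a list of entries shaped like the EC2 `DescribeVpcs` response (`VpcId`, `IsDefault`).

```python
from __future__ import annotations

from typing import Any, Optional


def pick_vpc(vpcs: list[dict[str, Any]], preferred_vpc_id: Optional[str] = None) -> str:
    """Return the VPC ID to deploy into.

    Prefers `preferred_vpc_id` when given; otherwise uses the VPC flagged
    IsDefault. Raises LookupError instead of guessing when neither is found.
    """
    if preferred_vpc_id is not None:
        for vpc in vpcs:
            if vpc["VpcId"] == preferred_vpc_id:
                return vpc["VpcId"]
        raise LookupError(f"requested VPC {preferred_vpc_id} not found")

    defaults = [vpc["VpcId"] for vpc in vpcs if vpc.get("IsDefault")]
    if not defaults:
        raise LookupError("no default VPC found; specify one explicitly")
    return defaults[0]


# Hypothetical account with a leftover VPC next to the default one.
sample_vpcs = [
    {"VpcId": "vpc-0leftover", "IsDefault": False},
    {"VpcId": "vpc-0default", "IsDefault": True},
]

print(pick_vpc(sample_vpcs))
print(pick_vpc(sample_vpcs, preferred_vpc_id="vpc-0leftover"))
```

Failing loudly when no suitable VPC exists is a deliberate choice in this sketch: silently picking an arbitrary VPC is the kind of behaviour that led to the private-subnet deployment described in this thread.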

