Name or service not known

When Lambda tries to deploy the changes it fails. Here’s the CloudWatch Logs dump:

START RequestId: f5ff58dd-fc68-11e7-8aaf-910e87942b5f Version: $LATEST

XXXXXXXXXXX.dkr.ecr.us-west-2.amazonaws.com/k8s-c-repos-1bdxoih448581 d8d49eb0 codesuite-demo

2018-01-18 16:02:22,662 WARNING Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f7567c3a7f0>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /apis/extensions/v1beta1/namespaces/default/deployments/codesuite-demo

[WARNING] 2018-01-18T16:02:22.662Z f5ff58dd-fc68-11e7-8aaf-910e87942b5f Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f7567c3a7f0>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /apis/extensions/v1beta1/namespaces/default/deployments/codesuite-demo

2018-01-18 16:02:22,663 WARNING Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f7567c3afd0>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /apis/extensions/v1beta1/namespaces/default/deployments/codesuite-demo

[WARNING] 2018-01-18T16:02:22.663Z f5ff58dd-fc68-11e7-8aaf-910e87942b5f Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f7567c3afd0>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /apis/extensions/v1beta1/namespaces/default/deployments/codesuite-demo

2018-01-18 16:02:22,665 WARNING Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f7567c3a7b8>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /apis/extensions/v1beta1/namespaces/default/deployments/codesuite-demo

[WARNING] 2018-01-18T16:02:22.665Z f5ff58dd-fc68-11e7-8aaf-910e87942b5f Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f7567c3a7b8>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /apis/extensions/v1beta1/namespaces/default/deployments/codesuite-demo

HTTPSConnectionPool(host='XXXXXXXXXXXXXXXXXX.us-west-2.elb.amazonaws.com', port=443): Max retries exceeded with url: /apis/extensions/v1beta1/namespaces/default/deployments/codesuite-demo (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f7567c3a518>: Failed to establish a new connection: [Errno -2] Name or service not known',))
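
The `[Errno -2] Name or service not known` in the log above is a name-resolution failure: the hostname could not be resolved before any connection was attempted. A minimal sketch of how one might check a hostname from the same environment, using only the Python standard library, is below. The helper name `can_resolve` is hypothetical and not part of the original Lambda code, and the result is only meaningful when the check runs from the same network context as the failing function.

```python
import socket


def can_resolve(hostname, port=443):
    """Return True if the local resolver can resolve `hostname`, else False."""
    try:
        # getaddrinfo performs the same lookup that an HTTPS client
        # does before it opens a connection.
        socket.getaddrinfo(hostname, port)
        return True
    except socket.gaierror:
        # Includes EAI_NONAME, which Linux reports as
        # "[Errno -2] Name or service not known".
        return False
```

Logging the result of `can_resolve(...)` for the configured API endpoint at the start of the handler would show whether the hostname itself is wrong before any retries happen.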

Here’s some information about my k8s cluster:

Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.1", GitCommit:"3a1c9449a956b6026f075fa3134ff92f7d55f812", GitTreeState:"clean", BuildDate:"2018-01-04T11:52:23Z", GoVersion:"go1.9.2", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.4", GitCommit:"9befc2b8928a9426501d3bf62f72849d5cbcd5a3", GitTreeState:"clean", BuildDate:"2017-11-20T05:17:43Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"linux/amd64"}

kubeProxyVersion: v1.8.4
kubeletVersion: v1.8.4
KOPS version: Version 1.8.0

Issue Analytics

  • State: open
  • Created 6 years ago
  • Comments: 13 (1 by maintainers)

Top GitHub Comments

2 reactions
dustyketchum commented, Apr 12, 2018

@minghsieh-prenetics thanks, but our subnets already had internet access, so this wasn’t our issue. My earlier message assumed that AWS networking was set up ‘properly’ (NATs, internet access available in private VPCs, etc.), though I didn’t state that explicitly.

I believe the first problem is that the instructions assume you have created a publicly available kubernetes cluster, or that you’re using ec2 classic without a vpc (or perhaps both). In either case, that assumption should be explicitly documented. This cloudformation template won’t work as is for anyone with a cluster in a private network in a vpc.

This was my first exposure to lambda, which made troubleshooting more challenging. I believe the changes I needed to make were, in order:

  1. Fix the api endpoint. I had the wrong one because I had failed to remove the leading ‘api’ from the fqdn when I passed it to cloudformation. The readme says “enter only the subdomain and omit ‘api’”, but the parameter description in the cloudformation template is missing this information. Cloudwatch logs helped here: I was able to see the wrong endpoint in the logs.
  2. Assign the AWSLambdaVPCAccessExecutionRole IAM policy to the new ‘…codepipelinelambdarole…’ role created by the cloudformation template.
  3. Update the new ‘…Pipeline-xxxx-LambdaKubernetesDeployme…’ lambda function created by the cloudformation template: add it to the correct vpc, the correct subnets (the same subnets as your kube api endpoint), and the correct security group(s) (SGs capable of using SSL to communicate with the kube api endpoint). If you try to assign the lambda function to the VPC (this step) before adding the AWSLambdaVPCAccessExecutionRole policy to your IAM role (the prior step), you get a helpful error message telling you exactly what you need to do. What isn’t necessarily obvious unless you’ve worked with lambda before is that the lambda function does NOT get added to the vpc when you see that error…
  4. Leave the DeploymentName parameter in the cloudformation template at its default; changing it does not seem to work. Cloudwatch logs again helped here.

The cloudformation template could be updated to handle items 2 and 3 without too much trouble (ask for the vpc, subnet(s), and security group as cloudformation parameters).
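
As a rough illustration of items 2 and 3 done by script rather than in the console, here is a minimal sketch. It assumes boto3-style clients, whose `attach_role_policy` and `update_function_configuration` calls it uses. The function name `attach_lambda_to_vpc` and all argument values are hypothetical, and the role and function names must be replaced with the ones your cloudformation stack actually created.

```python
# AWS-managed policy that lets a Lambda function manage the network
# interfaces it needs inside a VPC.
VPC_ACCESS_POLICY_ARN = (
    "arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole"
)


def attach_lambda_to_vpc(iam_client, lambda_client, role_name,
                         function_name, subnet_ids, security_group_ids):
    """Attach the VPC access policy to the role, then put the function in the VPC.

    The order matters: the policy is attached first (item 2), because the
    VPC assignment (item 3) is rejected if the role lacks it.
    """
    iam_client.attach_role_policy(
        RoleName=role_name,
        PolicyArn=VPC_ACCESS_POLICY_ARN,
    )
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        VpcConfig={
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
    )
```

With boto3 installed and credentials configured, the clients would be `boto3.client("iam")` and `boto3.client("lambda")`. Taking the clients as parameters keeps the sketch easy to try against stubs before pointing it at a real account.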

Thanks, Dusty

1 reaction
ghost commented, Mar 24, 2018

@dustyketchum

I think you are missing the following:

  1. The assigned subnet must be a PRIVATE subnet within the same VPC as k8s, even if your k8s is located in a public subnet.
  2. The private subnet must have NAT (a NAT gateway or NAT instance) and proper routing.
  3. The Lambda function must be assigned to the private subnet.

Below is the actual architecture diagram, although we use Github not CodeCommit.

[Image: workflowdetail]

@omarlari

Actually, I really don’t know whether EKS will change everything, and whether CodeDeploy will consequently get options to deploy to EKS. In that case, contributors might ask, “Why should I work on something that will soon be updated?”
