Unable to detect the kubelet URL automatically / cannot validate certificate
Output of the info page
Getting the status from the agent.
==============
Agent (v6.6.0)
==============
Status date: 2018-11-13 23:10:34.603102 UTC
Pid: 342
Python Version: 2.7.15
Logs:
Check Runners: 4
Log Level: info
Paths
=====
Config File: /etc/datadog-agent/datadog.yaml
conf.d: /etc/datadog-agent/conf.d
checks.d: /etc/datadog-agent/checks.d
Clocks
======
NTP offset: 1.461ms
System UTC time: 2018-11-13 23:10:34.603102 UTC
Host Info
=========
bootTime: 2018-11-08 08:50:28.000000 UTC
kernelVersion: 4.9.0-7-amd64
os: linux
platform: debian
platformFamily: debian
platformVersion: buster/sid
procs: 70
uptime: 133h51m42s
virtualizationRole: host
virtualizationSystem: kvm
Hostnames
=========
hostname: reverent-kapitsa-1us
socket-fqdn: datadog-agent-pxkhm
socket-hostname: datadog-agent-pxkhm
hostname provider: container
unused hostname providers:
aws: not retrieving hostname from AWS: the host is not an ECS instance, and other providers already retrieve non-default hostnames
configuration/environment: hostname is empty
gce: unable to retrieve hostname from GCE: status code 404 trying to GET http://169.254.169.254/computeMetadata/v1/instance/hostname
=========
Collector
=========
Running Checks
==============
cpu
---
Instance ID: cpu [OK]
Total Runs: 114
Metric Samples: 6, Total: 678
Events: 0, Total: 0
Service Checks: 0, Total: 0
Average Execution Time : 0s
disk (1.4.0)
------------
Instance ID: disk:e5dffb8bef24336f [OK]
Total Runs: 114
Metric Samples: 190, Total: 21,660
Events: 0, Total: 0
Service Checks: 0, Total: 0
Average Execution Time : 197ms
docker
------
Instance ID: docker [OK]
Total Runs: 113
Metric Samples: 216, Total: 23,850
Events: 0, Total: 6
Service Checks: 1, Total: 113
Average Execution Time : 203ms
file_handle
-----------
Instance ID: file_handle [OK]
Total Runs: 114
Metric Samples: 5, Total: 570
Events: 0, Total: 0
Service Checks: 0, Total: 0
Average Execution Time : 0s
io
--
Instance ID: io [OK]
Total Runs: 113
Metric Samples: 39, Total: 4,380
Events: 0, Total: 0
Service Checks: 0, Total: 0
Average Execution Time : 0s
kubelet (2.2.0)
---------------
Instance ID: kubelet:d884b5186b651429 [ERROR]
Total Runs: 114
Metric Samples: 0, Total: 0
Events: 0, Total: 0
Service Checks: 0, Total: 0
Average Execution Time : 8ms
Error: Unable to detect the kubelet URL automatically.
Traceback (most recent call last):
File "/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/base/checks/base.py", line 366, in run
self.check(copy.deepcopy(self.instances[0]))
File "/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/kubelet/kubelet.py", line 113, in check
raise CheckException("Unable to detect the kubelet URL automatically.")
CheckException: Unable to detect the kubelet URL automatically.
kubernetes_apiserver
--------------------
Instance ID: kubernetes_apiserver [OK]
Total Runs: 113
Metric Samples: 0, Total: 0
Events: 0, Total: 0
Service Checks: 0, Total: 0
Average Execution Time : 11ms
load
----
Instance ID: load [OK]
Total Runs: 114
Metric Samples: 6, Total: 684
Events: 0, Total: 0
Service Checks: 0, Total: 0
Average Execution Time : 2ms
memory
------
Instance ID: memory [OK]
Total Runs: 113
Metric Samples: 17, Total: 1,921
Events: 0, Total: 0
Service Checks: 0, Total: 0
Average Execution Time : 0s
network (1.7.0)
---------------
Instance ID: network:2a218184ebe03606 [OK]
Total Runs: 114
Metric Samples: 74, Total: 8,754
Events: 0, Total: 0
Service Checks: 0, Total: 0
Average Execution Time : 9ms
ntp
---
Instance ID: ntp:b4579e02d1981c12 [OK]
Total Runs: 113
Metric Samples: 1, Total: 113
Events: 0, Total: 0
Service Checks: 1, Total: 113
Average Execution Time : 2ms
uptime
------
Instance ID: uptime [OK]
Total Runs: 114
Metric Samples: 1, Total: 114
Events: 0, Total: 0
Service Checks: 0, Total: 0
Average Execution Time : 2ms
========
JMXFetch
========
Initialized checks
==================
no checks
Failed checks
=============
no checks
=========
Forwarder
=========
CheckRunsV1: 113
Dropped: 0
DroppedOnInput: 0
Events: 0
HostMetadata: 0
IntakeV1: 11
Metadata: 0
Requeued: 0
Retried: 0
RetryQueueSize: 0
Series: 0
ServiceChecks: 0
SketchSeries: 0
Success: 237
TimeseriesV1: 113
API Keys status
===============
API key ending with 1ed66 on endpoint https://app.datadoghq.com: API Key valid
==========
Logs Agent
==========
container_collect_all
---------------------
Type: docker
Status: Pending
=========
DogStatsD
=========
Checks Metric Sample: 65,227
Event: 7
Events Flushed: 7
Number Of Flushes: 113
Series Flushed: 53,494
Service Check: 1,478
Service Checks Flushed: 1,578
Dogstatsd Metric Sample: 11,877
Additional environment details (Operating System, Cloud provider, etc):
Kubernetes 1.12 cluster on DigitalOcean.
Steps to reproduce the issue:
- Deploy the Datadog agent using the provided Kubernetes resources.
- View the agent logs.
Describe the results you received:
[ AGENT ] 2018-11-13 22:42:31 UTC | ERROR | (kubeutil.go:50 in GetKubeletConnectionInfo) | connection to kubelet failed: temporary failure in kubeutil, will retry later: try delay not elapsed yet
[ AGENT ] 2018-11-13 22:42:31 UTC | ERROR | (runner.go:289 in work) | Error running check kubelet: [{"message": "Unable to detect the kubelet URL automatically.", "traceback": "Traceback (most recent call last):\n File \"/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/base/checks/base.py\", line 366, in run\n self.check(copy.deepcopy(self.instances[0]))\n File \"/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/kubelet/kubelet.py\", line 113, in check\n raise CheckException(\"Unable to detect the kubelet URL automatically.\")\nCheckException: Unable to detect the kubelet URL automatically.\n"}]
[...]
[ AGENT ] 2018-11-13 22:42:39 UTC | ERROR | (autoconfig.go:608 in collect) | Unable to collect configurations from provider Kubernetes: temporary failure in kubeutil, will retry later: cannot connect: https: "Get https://10.133.78.180:10250/pods: x509: cannot validate certificate for 10.133.78.180 because it doesn't contain any IP SANs", http: "Get http://10.133.78.180:10255/pods: dial tcp 10.133.78.180:10255: connect: connection refused"
[ AGENT ] 2018-11-13 22:42:39 UTC | INFO | (autoconfig.go:362 in initListenerCandidates) | kubelet listener cannot start, will retry: temporary failure in kubeutil, will retry later: cannot connect: https: "Get https://10.133.78.180:10250/pods: x509: cannot validate certificate for 10.133.78.180 because it doesn't contain any IP SANs", http: "Get http://10.133.78.180:10255/pods: dial tcp 10.133.78.180:10255: connect: connection refused"
Many dashboard entries remain empty.
Describe the results you expected:
No errors, access to kubelet, functional Kubernetes dashboard.
Additional information you deem important (e.g. issue happens only occasionally):
This seems to be the same problem as #1829, but that issue is closed. As far as I know, hosted Kubernetes services like DigitalOcean do not allow editing the kubelet configuration.
Issue Analytics
- State:
- Created 5 years ago
- Reactions:32
- Comments:73 (10 by maintainers)
Top GitHub Comments
@mjhuber I opened a ticket on the Datadog issue tracker. The advice was to set DD_KUBELET_TLS_VERIFY=false for now. Hopefully DO will start using real certificates for the Kubelet API.

For anyone having this issue, I have been working through this with Datadog's Support Team.
It appears that AKS has changed the location of the Kubelet Client CA Cert, at least between AKS 1.16.7 and 1.16.9.
The certificate used by AKS is now located on the node at /etc/kubernetes/certs/kubeletserver.crt.
If you use the Helm charts, you can set the following values and the new certificates should get loaded correctly.
After adding and deploying this config, my Datadog agents (Helm 2.3.18 and Docker image 7.20.2) on AKS 1.16.9 are now working correctly. The agent throws warnings that the certificate has no subjectAltName, but metrics are sent to Datadog successfully.
Hopefully a more permanent fix will follow, but this is good enough for now to get things working again without having to set DD_KUBELET_TLS_VERIFY=false.
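For reference, the DD_KUBELET_TLS_VERIFY=false workaround mentioned in the comments is set as an environment variable on the agent container. The fragment below is a minimal sketch of how that might look in a DaemonSet manifest. The container name "agent" and the surrounding layout are assumptions for illustration and are not taken from this thread; only the variable name and value come from the comments above.

```yaml
# Fragment of a Datadog agent DaemonSet (sketch only).
# Assumption: the agent container is named "agent"; adjust to your manifest.
spec:
  template:
    spec:
      containers:
        - name: agent
          env:
            # Tells the agent not to verify the kubelet's serving certificate.
            # The value must be a quoted string, not a YAML boolean.
            - name: DD_KUBELET_TLS_VERIFY
              value: "false"
```

With verification disabled, the agent still connects to the kubelet over HTTPS on port 10250 but no longer checks that the certificate matches the node IP, so this is best treated as a temporary measure until the kubelet presents a certificate the agent can validate.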