Broken cert expiration check for http_check with tls_verify disabled
===============
Agent (v6.33.0)
===============
Status date: 2022-03-01 19:02:19.055 UTC (1646161339055)
Agent start: 2022-03-01 18:17:47.42 UTC (1646158667420)
Pid: 411
Go Version: go1.16.7
Python Version: 2.7.18
Build arch: amd64
Agent flavor: agent
Check Runners: 5
Log File: /mnt/log/agent.log
Log Level: INFO
Paths
=====
Config File: /etc/datadog-agent/datadog.yaml
conf.d: /etc/datadog-agent/conf.d
checks.d: /etc/datadog-agent/checks.d
Clocks
======
NTP offset: -116µs
System time: 2022-03-01 19:02:19.055 UTC (1646161339055)
Host Info
=========
bootTime: 2021-05-24 13:16:00 UTC (1621862160000)
kernelArch: x86_64
kernelVersion: 4.14.231-173.361.amzn2.x86_64
os: linux
platform: ubuntu
platformFamily: debian
platformVersion: 21.10
procs: 167
uptime: 6749h1m55s
Hostnames
=========
cluster-name: k8s-test
ec2-hostname: ip-10-3-38-250.us-west-2.compute.internal
host_aliases: [ip-10-3-38-250.us-west-2.compute.internal-k8s-test]
hostname: i-04c046d812156653x
instance-id: i-04c046d812156653x
socket-fqdn: ip-10-3-38-250.us-west-2.compute.internal.
socket-hostname: ip-10-3-38-250.us-west-2.compute.internal
host tags:
cloud:AWS
<redacted>
hostname provider: aws
unused hostname providers:
azure: azure_hostname_style is set to 'os'
configuration/environment: hostname is empty
gce: unable to retrieve hostname from GCE: GCE metadata API error: status code 404 trying to GET http://169.254.169.254/computeMetadata/v1/instance/hostname
Metadata
========
agent_version: 6.33.0
cloud_provider: AWS
config_apm_dd_url:
config_dd_url:
config_logs_dd_url:
config_logs_socks5_proxy_address:
config_no_proxy: []
config_process_dd_url:
config_proxy_http:
config_proxy_https:
config_site:
feature_apm_enabled: true
feature_cspm_enabled: false
feature_cws_enabled: false
feature_logs_enabled: false
feature_networks_enabled: false
feature_process_enabled: true
flavor: agent
hostname_source: aws
install_method_installer_version: docker
install_method_tool: docker
install_method_tool_version: docker
=========
Collector
=========
Running Checks
==============
container
---------
Instance ID: container [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/container.d/conf.yaml.default
Total Runs: 178
Metric Samples: Last Run: 231, Total: 41,118
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 6ms
Last Execution Date : 2022-03-01 19:02:11 UTC (1646161331000)
Last Successful Execution Date : 2022-03-01 19:02:11 UTC (1646161331000)
cpu
---
Instance ID: cpu [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/cpu.d/conf.yaml.default
Total Runs: 178
Metric Samples: Last Run: 9, Total: 1,595
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2022-03-01 19:02:18 UTC (1646161338000)
Last Successful Execution Date : 2022-03-01 19:02:18 UTC (1646161338000)
disk (4.5.1)
------------
Instance ID: disk:b2cf39b497091ec9 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/disk.d/disk.yaml
Total Runs: 178
Metric Samples: Last Run: 84, Total: 14,952
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 67ms
Last Execution Date : 2022-03-01 19:02:10 UTC (1646161330000)
Last Successful Execution Date : 2022-03-01 19:02:10 UTC (1646161330000)
dns_check (2.1.0)
-----------------
Instance ID: dns_check:google-check-via-kube-dns:478dd776df194c16 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/dns_check.d/conf.yaml
Total Runs: 178
Metric Samples: Last Run: 1, Total: 178
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 1, Total: 178
Average Execution Time : 1ms
Last Execution Date : 2022-03-01 19:02:17 UTC (1646161337000)
Last Successful Execution Date : 2022-03-01 19:02:17 UTC (1646161337000)
Instance ID: dns_check:google-check-via-public-cloudflare-dns:7018628659fe2e44 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/dns_check.d/conf.yaml
Total Runs: 178
Metric Samples: Last Run: 1, Total: 178
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 1, Total: 178
Average Execution Time : 9ms
Last Execution Date : 2022-03-01 19:02:16 UTC (1646161336000)
Last Successful Execution Date : 2022-03-01 19:02:16 UTC (1646161336000)
Instance ID: dns_check:google-check-via-public-google-dns:f6b991be82834544 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/dns_check.d/conf.yaml
Total Runs: 177
Metric Samples: Last Run: 1, Total: 177
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 1, Total: 177
Average Execution Time : 8ms
Last Execution Date : 2022-03-01 19:02:08 UTC (1646161328000)
Last Successful Execution Date : 2022-03-01 19:02:08 UTC (1646161328000)
Instance ID: dns_check:k8s-default-via-kube-dns:51ca8513f09e2997 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/dns_check.d/conf.yaml
Total Runs: 177
Metric Samples: Last Run: 1, Total: 177
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 1, Total: 177
Average Execution Time : 3ms
Last Execution Date : 2022-03-01 19:02:09 UTC (1646161329000)
Last Successful Execution Date : 2022-03-01 19:02:09 UTC (1646161329000)
file_handle
-----------
Instance ID: file_handle [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/file_handle.d/conf.yaml.default
Total Runs: 178
Metric Samples: Last Run: 5, Total: 890
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2022-03-01 19:02:10 UTC (1646161330000)
Last Successful Execution Date : 2022-03-01 19:02:10 UTC (1646161330000)
filebeat (unversioned)
----------------------
Instance ID: filebeat:594c29c48fdd43de [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/filebeat.d/conf.yaml
Total Runs: 178
Metric Samples: Last Run: 4, Total: 708
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 1, Total: 178
Average Execution Time : 6ms
Last Execution Date : 2022-03-01 19:02:15 UTC (1646161335000)
Last Successful Execution Date : 2022-03-01 19:02:15 UTC (1646161335000)
http_check (6.1.2-rc.1)
-----------------------
Instance ID: http_check:Kafka Api liveness:db74e3a9e13b30f0 [ERROR]
Configuration Source: file:/etc/datadog-agent/conf.d/http_check.d/httpi.yaml
Total Runs: 177
Metric Samples: Last Run: 3, Total: 531
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 27ms
Last Execution Date : 2022-03-01 19:02:07 UTC (1646161327000)
Last Successful Execution Date : Never
Error: u'notAfter'
Traceback (most recent call last):
File "/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/base/checks/base.py", line 1017, in run
self.check(instance)
File "/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/http_check/http_check.py", line 240, in check
status, days_left, seconds_left, msg = self.check_cert_expiration(instance, timeout, instance_ca_certs)
File "/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/http_check/http_check.py", line 326, in check_cert_expiration
exp_date = datetime.strptime(cert['notAfter'], "%b %d %H:%M:%S %Y %Z")
KeyError: u'notAfter'
Instance ID: http_check:kubeapi_server_health_check:a9578cd2db2bee0c [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/http_check.d/kubeapi-server.yaml
Total Runs: 178
Metric Samples: Last Run: 5, Total: 890
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 2, Total: 356
Average Execution Time : 48ms
Last Execution Date : 2022-03-01 19:02:14 UTC (1646161334000)
Last Successful Execution Date : 2022-03-01 19:02:14 UTC (1646161334000)
io
--
Instance ID: io [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/io.d/conf.yaml.default
Total Runs: 178
Metric Samples: Last Run: 39, Total: 6,915
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2022-03-01 19:02:17 UTC (1646161337000)
Last Successful Execution Date : 2022-03-01 19:02:17 UTC (1646161337000)
jmxfetch (unversioned)
----------------------
Instance ID: jmxfetch:d884b5186b651429 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/jmxfetch.d/conf.yaml
Total Runs: 177
Metric Samples: Last Run: 2, Total: 354
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 1, Total: 177
Average Execution Time : 265ms
Last Execution Date : 2022-03-01 19:02:06 UTC (1646161326000)
Last Successful Execution Date : 2022-03-01 19:02:06 UTC (1646161326000)
kubelet (7.1.0)
---------------
Instance ID: kubelet:5bbc63f3938c02f4 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/kubelet.d/conf.yaml.default
Total Runs: 134
Metric Samples: Last Run: 832, Total: 112,824
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 4, Total: 536
Average Execution Time : 295ms
Last Execution Date : 2022-03-01 19:02:15 UTC (1646161335000)
Last Successful Execution Date : 2022-03-01 19:02:15 UTC (1646161335000)
kubernetes_apiserver
--------------------
Instance ID: kubernetes_apiserver [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/kubernetes_apiserver.d/conf.yaml.default
Total Runs: 177
Metric Samples: Last Run: 0, Total: 0
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2022-03-01 19:02:09 UTC (1646161329000)
Last Successful Execution Date : 2022-03-01 19:02:09 UTC (1646161329000)
load
----
Instance ID: load [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/load.d/conf.yaml.default
Total Runs: 178
Metric Samples: Last Run: 6, Total: 1,068
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2022-03-01 19:02:16 UTC (1646161336000)
Last Successful Execution Date : 2022-03-01 19:02:16 UTC (1646161336000)
memory
------
Instance ID: memory [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/memory.d/conf.yaml.default
Total Runs: 177
Metric Samples: Last Run: 18, Total: 3,186
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2022-03-01 19:02:08 UTC (1646161328000)
Last Successful Execution Date : 2022-03-01 19:02:08 UTC (1646161328000)
network (2.4.0)
---------------
Instance ID: network:d884b5186b651429 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/network.d/conf.yaml.default
Total Runs: 178
Metric Samples: Last Run: 73, Total: 12,994
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 3ms
Last Execution Date : 2022-03-01 19:02:15 UTC (1646161335000)
Last Successful Execution Date : 2022-03-01 19:02:15 UTC (1646161335000)
ntp
---
Instance ID: ntp:d884b5186b651429 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/ntp.d/conf.yaml.default
Total Runs: 3
Metric Samples: Last Run: 1, Total: 3
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 1, Total: 3
Average Execution Time : 1ms
Last Execution Date : 2022-03-01 18:47:55 UTC (1646160475000)
Last Successful Execution Date : 2022-03-01 18:47:55 UTC (1646160475000)
openmetrics (1.16.0)
--------------------
Instance ID: openmetrics:cc-cert-exporter:206dfb034ec9d8a1 [OK]
Configuration Source: kubelet:docker://c3f28337d7a24bff036bc5a782790f1ff51d2f7227d6d82b12f1044171f60a44
Total Runs: 177
Metric Samples: Last Run: 3, Total: 531
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 1, Total: 177
Average Execution Time : 26ms
Last Execution Date : 2022-03-01 19:02:06 UTC (1646161326000)
Last Successful Execution Date : 2022-03-01 19:02:06 UTC (1646161326000)
Instance ID: openmetrics:cc-goldpinger:b8f5b76d5edd920d [OK]
Configuration Source: kubelet:docker://bc51640f268d5fed014644b99c4c9acbe8e4cb54c335bc6e4b0b3acacc4a9b7b
Total Runs: 177
Metric Samples: Last Run: 7, Total: 1,239
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 1, Total: 177
Average Execution Time : 63ms
Last Execution Date : 2022-03-01 19:02:14 UTC (1646161334000)
Last Successful Execution Date : 2022-03-01 19:02:14 UTC (1646161334000)
process (2.1.1)
---------------
Instance ID: process:kube-proxy:da726e50de09c3a5 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/process.d/conf.yaml
Total Runs: 177
Metric Samples: Last Run: 18, Total: 3,184
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 1, Total: 177
Average Execution Time : 1ms
Last Execution Date : 2022-03-01 19:02:05 UTC (1646161325000)
Last Successful Execution Date : 2022-03-01 19:02:05 UTC (1646161325000)
Instance ID: process:kubelet:f36c694c86c7b245 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/process.d/conf.yaml
Total Runs: 178
Metric Samples: Last Run: 18, Total: 3,202
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 1, Total: 178
Average Execution Time : 1ms
Last Execution Date : 2022-03-01 19:02:13 UTC (1646161333000)
Last Successful Execution Date : 2022-03-01 19:02:13 UTC (1646161333000)
Instance ID: process:systemd:f976e8edf3df59c7 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/process.d/conf.yaml
Total Runs: 178
Metric Samples: Last Run: 18, Total: 3,202
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 1, Total: 178
Average Execution Time : 3ms
Last Execution Date : 2022-03-01 19:02:12 UTC (1646161332000)
Last Successful Execution Date : 2022-03-01 19:02:12 UTC (1646161332000)
tls (2.6.0)
-----------
Instance ID: tls:public-cert-expiration-check:198d9e47b2956736 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/tls.d/standard.yaml
Total Runs: 177
Metric Samples: Last Run: 2, Total: 354
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 4, Total: 708
Average Execution Time : 7ms
Last Execution Date : 2022-03-01 19:02:04 UTC (1646161324000)
Last Successful Execution Date : 2022-03-01 19:02:04 UTC (1646161324000)
uptime
------
Instance ID: uptime [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/uptime.d/conf.yaml.default
Total Runs: 177
Metric Samples: Last Run: 1, Total: 177
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2022-03-01 19:02:07 UTC (1646161327000)
Last Successful Execution Date : 2022-03-01 19:02:07 UTC (1646161327000)
========
JMXFetch
========
Information
==================
runtime_version : 11.0.13
version : 0.44.6
Initialized checks
==================
jmx
instance_name : http-10.3.34.32-7203
message : <no value>
metric_count : 72
service_check_count : 0
status : OK
Failed checks
=============
no checks
=========
Forwarder
=========
Transactions
============
Cluster: 0
ClusterRole: 0
ClusterRoleBinding: 0
CronJob: 0
DaemonSet: 0
Deployment: 0
Dropped: 0
HighPriorityQueueFull: 0
Job: 0
Node: 0
PersistentVolume: 0
PersistentVolumeClaim: 0
Pod: 0
ReplicaSet: 0
Requeued: 0
Retried: 0
RetryQueueSize: 0
Role: 0
RoleBinding: 0
Service: 0
ServiceAccount: 0
StatefulSet: 0
Transaction Successes
=====================
Total number: 373
Successes By Endpoint:
check_run_v1: 177
intake: 15
metadata_v1: 4
series_v1: 177
On-disk storage
===============
On-disk storage is disabled. Configure `forwarder_storage_max_size_in_bytes` to enable it.
API Keys status
===============
API key ending with <redacted>: API Key valid
==========
Endpoints
==========
https://app.datadoghq.com - API Key ending with:
- 62281
==========
Logs Agent
==========
Logs Agent is not running
=========
APM Agent
=========
Status: Running
Pid: 414
Uptime: 2671 seconds
Mem alloc: 8,889,824 bytes
Hostname: i-04c046d812156653x
Receiver: 0.0.0.0:8126
Endpoints:
https://trace.agent.datadoghq.com
Receiver (previous minute)
==========================
No traces received in the previous minute.
Default priority sampling rate: 100.0%
Writer (previous minute)
========================
Traces: 0 payloads, 0 traces, 0 events, 0 bytes
Stats: 0 payloads, 0 stats buckets, 0 bytes
=========
Aggregator
=========
Checks Metric Sample: 219,423
Dogstatsd Metric Sample: 34,379
Event: 1
Events Flushed: 1
Number Of Flushes: 177
Series Flushed: 216,047
Service Check: 8,131
Service Checks Flushed: 8,268
=========
DogStatsD
=========
Event Packets: 0
Event Parse Errors: 0
Metric Packets: 34,378
Metric Parse Errors: 0
Service Check Packets: 178
Service Check Parse Errors: 0
Udp Bytes: 12,109,080
Udp Packet Reading Errors: 0
Udp Packets: 28,271
Uds Bytes: 0
Uds Origin Detection Errors: 0
Uds Packet Reading Errors: 0
Uds Packets: 0
Unterminated Metric Errors: 0
=============
Autodiscovery
=============
Enabled Features
================
kubernetes
Configuration Errors
====================
kube-system/node-local-dns-kshnf
--------------------------------
annotation ad.datadoghq.com/kube2iam.check_names is invalid: kube2iam doesn't match a container identifier [node-cache]
annotation ad.datadoghq.com/kube2iam.init_configs is invalid: kube2iam doesn't match a container identifier [node-cache]
annotation ad.datadoghq.com/kube2iam.instances is invalid: kube2iam doesn't match a container identifier [node-cache]
Additional environment details (Operating System, Cloud provider, etc):
Using Docker image datadog/agent:6.33.0-jmx
on EKS v1.18.20
Steps to reproduce the issue:
- Run an https server in a k8s pod
- Configure dd-agent to do http_check similar to this.
    ad_identifiers:
      - http-server
    init_config:
    instances:
      - name: liveness
        url: "https://%%host%%:443/health"
        tls_verify: false
- Run
    agent check http_check
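The behaviour can probably be reproduced outside the agent with a few lines of standard-library Python. This is only a sketch, written for Python 3 and not taken from the http_check source: the helper names `unverified_context` and `peer_cert_dict` are made up for illustration, and the assumption is that `tls_verify: false` ends up as a context with `ssl.CERT_NONE`.

```python
import socket
import ssl


def unverified_context():
    """Build a client context with no chain or hostname verification.

    Assumption: this roughly mirrors what tls_verify: false produces.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be disabled before setting CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def peer_cert_dict(host, port=443, timeout=5.0):
    """Connect to host:port and return SSLSocket.getpeercert().

    With ssl.CERT_NONE the certificate is not validated, so the
    returned dict is expected to be empty.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with unverified_context().wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()
```

Calling `peer_cert_dict("<pod ip>")` against the HTTPS server from the first step should return a dict without a `notAfter` key, which is what the traceback below trips over.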
Describe the results you received:
=========
Collector
=========
Running Checks
==============
http_check (6.1.2-rc.1)
-----------------------
Instance ID: http_check:Kafka Api liveness:f8fb566fee6d98c8 [ERROR]
Configuration Source: file:/etc/datadog-agent/conf.d/http_check.d/http.yaml
Total Runs: 1
Metric Samples: Last Run: 3, Total: 3
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 37ms
Last Execution Date : 2022-03-01 16:55:21 UTC (1646153721000)
Last Successful Execution Date : Never
Error: u'notAfter'
Traceback (most recent call last):
File "/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/base/checks/base.py", line 1017, in run
self.check(instance)
File "/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/http_check/http_check.py", line 240, in check
status, days_left, seconds_left, msg = self.check_cert_expiration(instance, timeout, instance_ca_certs)
File "/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/http_check/http_check.py", line 326, in check_cert_expiration
exp_date = datetime.strptime(cert['notAfter'], "%b %d %H:%M:%S %Y %Z")
KeyError: u'notAfter'
Describe the results you expected:
I expected dd-agent to be able to check the certificate expiration and populate the http.ssl.days_left metric.
Additional information you deem important (e.g. issue happens only occasionally):
This PR, which went into versions 6.26/7.26, broke backward compatibility for this check in dd-agent. Previously, the socket's verify mode was hard-coded to ssl.CERT_REQUIRED. That PR switched to the TlsContextWrapper, which sets that mode only if full verification is enabled. In ssl.CERT_NONE mode, the SSL socket does not return a usable peer certificate (getpeercert() yields an empty dict), so the cert['notAfter'] lookup fails with the KeyError shown above.
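To illustrate the failure mode, here is a minimal sketch of a defensive version of the expiry calculation. It is written for Python 3 and is not the agent's implementation; `cert_days_left` is a hypothetical helper, and the only real API it leans on is `ssl.cert_time_to_seconds`, which parses the `notAfter` string format.

```python
import ssl
from datetime import datetime, timezone


def cert_days_left(cert, now=None):
    """Return (days_left, seconds_left), or None when the dict has no 'notAfter'.

    `cert` is the dict returned by SSLSocket.getpeercert(); it is empty
    when the peer certificate was not validated (ssl.CERT_NONE).
    """
    not_after = (cert or {}).get("notAfter")
    if not_after is None:
        return None  # nothing to parse; the caller decides how to report this
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    remaining = expires - now
    return remaining.days, remaining.total_seconds()


# An empty dict, as returned under CERT_NONE, no longer raises KeyError.
print(cert_days_left({}))  # prints None
```

A guard like this only avoids the crash. To actually report days left without verification, one option might be `getpeercert(binary_form=True)`, which returns the DER-encoded certificate even when it was not validated, but that requires a separate X.509 parser to read the expiry date.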
Top GitHub Comments
@sarah-witt Done. (Request #703748)
@hithwen @sarah-witt There’s no meaningful action on my support case. Should we reopen this? Unless I’m missing something, dd-agent should not break backward compatibility, especially in a minor version release, and this needs to be fixed ASAP for all DD customers.