Consul checks failing with ACL and TLS enabled when using K8s Autodiscovery and Secrets Management


I’m using Consul and Datadog on Kubernetes with Autodiscovery and the ENC[] secrets feature. I deploy everything with Terraform (AWS, Kubernetes, and Helm providers).

It looks like X-Consul-Token is not passed correctly when the token comes from an encrypted value. When I replace it with a plaintext token, the check seems to work.

The second issue is that the default checks run alongside the checks I defined explicitly via annotations.

Debug info

Output of the info page

Getting the status from the agent.

===============
Agent (v7.19.0)
===============

  Status date: 2020-05-27 11:58:57.172295 UTC
  Agent start: 2020-05-27 10:00:46.490660 UTC
  Pid: 1
  Go Version: go1.13.8
  Python Version: 3.8.1
  Build arch: amd64
  Check Runners: 4
  Log Level: INFO

  Paths
  =====
    Config File: /etc/datadog-agent/datadog.yaml
    conf.d: /etc/datadog-agent/conf.d
    checks.d: /etc/datadog-agent/checks.d

  Clocks
  ======
    NTP offset: -177µs
    System UTC time: 2020-05-27 11:58:57.172295 UTC

  Host Info
  =========
    bootTime: 2020-05-23 14:08:58.000000 UTC
    kernelVersion: 4.14.177-139.253.amzn2.x86_64
    os: linux
    platform: debian
    platformFamily: debian
    platformVersion: bullseye/sid
    procs: 60
    uptime: 91h51m53s
    virtualizationRole: guest
    virtualizationSystem: xen

  Hostnames
  =========
    ec2-hostname: masked
    host_aliases: [masked-internal]
    hostname: i-09c9206a947a6ca48
    instance-id: i-09c9206a947a6ca48
    socket-fqdn: datadog-5gwph
    socket-hostname: datadog-5gwph
    host tags:
      cluster_env:prod
      cluster_name:internal
      cluster_name:internal
    hostname provider: aws
    unused hostname providers:
      configuration/environment: hostname is empty
      gce: unable to retrieve hostname from GCE: status code 404 trying to GET http://169.254.169.254/computeMetadata/v1/instance/hostname

  Metadata
  ========
    cloud_provider: AWS
    hostname_source: aws

=========
Collector
=========

  Running Checks
  ==============
    
    consul (1.13.0)
    ---------------
      Instance ID: consul:6a8b38227309fa30 [ERROR]
      Configuration Source: kubelet:docker://c0ff3b2859617a5a403e4f16ac5a5df99aad46f4fe1797f2a03700079dc81f81
      Total Runs: 12
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 1, Total: 12
      Average Execution Time : 14ms
      Last Execution Date : 2020-05-27 11:58:54.000000 UTC
      Last Successful Execution Date : Never
      Error: 403 Client Error: Forbidden for url: https://10.15.5.140:8501/v1/agent/self
      Traceback (most recent call last):
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py", line 820, in run
          self.check(instance)
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/consul/consul.py", line 244, in check
          self._collect_metadata()
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/consul/consul.py", line 528, in _collect_metadata
          local_config = self._get_local_config()
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/consul/consul.py", line 103, in _get_local_config
          self._local_config = self.consul_request('/v1/agent/self')
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/consul/consul.py", line 76, in consul_request
          resp.raise_for_status()
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/models.py", line 940, in raise_for_status
          raise HTTPError(http_error_msg, response=self)
      requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://10.15.5.140:8501/v1/agent/self
      Instance ID: consul:8912e99ffe795c0c [OK]
      Configuration Source: kubelet:docker://f754c6add1f47b72d69e0851e3da20c88e81adac236558c2a73ab2e6ed7d111f
      Total Runs: 21
      Metric Samples: Last Run: 1, Total: 21
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 2, Total: 43
      Average Execution Time : 164ms
      Last Execution Date : 2020-05-27 11:58:55.000000 UTC
      Last Successful Execution Date : 2020-05-27 11:58:55.000000 UTC
      metadata:
        version.major: 1
        version.minor: 8
        version.patch: 0
        version.raw: 1.8.0
        version.scheme: semver
      
      Instance ID: consul:93e60a3b2d57d7a2 [ERROR]
      Configuration Source: file:/etc/datadog-agent/conf.d/consul.d/auto_conf.yaml
      Total Runs: 473
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 1, Total: 473
      Average Execution Time : 8ms
      Last Execution Date : 2020-05-27 11:58:51.000000 UTC
      Last Successful Execution Date : Never
      Error: HTTPConnectionPool(host='10.15.5.127', port=8500): Max retries exceeded with url: /v1/status/leader (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f492c092fa0>: Failed to establish a new connection: [Errno 111] Connection refused'))
      Traceback (most recent call last):
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connection.py", line 159, in _new_conn
          conn = connection.create_connection(
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/util/connection.py", line 84, in create_connection
          raise err
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/util/connection.py", line 74, in create_connection
          sock.connect(sa)
      ConnectionRefusedError: [Errno 111] Connection refused
      
      During handling of the above exception, another exception occurred:
      
      Traceback (most recent call last):
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen
          httplib_response = self._make_request(
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 392, in _make_request
          conn.request(method, url, **httplib_request_kw)
        File "/opt/datadog-agent/embedded/lib/python3.8/http/client.py", line 1230, in request
          self._send_request(method, url, body, headers, encode_chunked)
        File "/opt/datadog-agent/embedded/lib/python3.8/http/client.py", line 1276, in _send_request
          self.endheaders(body, encode_chunked=encode_chunked)
        File "/opt/datadog-agent/embedded/lib/python3.8/http/client.py", line 1225, in endheaders
          self._send_output(message_body, encode_chunked=encode_chunked)
        File "/opt/datadog-agent/embedded/lib/python3.8/http/client.py", line 1004, in _send_output
          self.send(msg)
        File "/opt/datadog-agent/embedded/lib/python3.8/http/client.py", line 944, in send
          self.connect()
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connection.py", line 187, in connect
          conn = self._new_conn()
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connection.py", line 171, in _new_conn
          raise NewConnectionError(
      urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f492c092fa0>: Failed to establish a new connection: [Errno 111] Connection refused
      
      During handling of the above exception, another exception occurred:
      
      Traceback (most recent call last):
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
          resp = conn.urlopen(
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 724, in urlopen
          retries = retries.increment(
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/util/retry.py", line 439, in increment
          raise MaxRetryError(_pool, url, error or ResponseError(cause))
      urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='10.15.5.127', port=8500): Max retries exceeded with url: /v1/status/leader (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f492c092fa0>: Failed to establish a new connection: [Errno 111] Connection refused'))
      
      During handling of the above exception, another exception occurred:
      
      Traceback (most recent call last):
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py", line 820, in run
          self.check(instance)
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/consul/consul.py", line 243, in check
          self._check_for_leader_change()
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/consul/consul.py", line 152, in _check_for_leader_change
          leader = self._get_cluster_leader()
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/consul/consul.py", line 109, in _get_cluster_leader
          return self.consul_request('/v1/status/leader')
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/consul/consul.py", line 74, in consul_request
          resp = self.http.get(url)
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 283, in get
          return self._request('get', url, options)
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 325, in _request
          return getattr(requests, method)(url, **new_options)
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/api.py", line 75, in get
          return request('get', url, params=params, **kwargs)
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/api.py", line 60, in request
          return session.request(method=method, url=url, **kwargs)
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/sessions.py", line 533, in request
          resp = self.send(prep, **send_kwargs)
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/sessions.py", line 646, in send
          r = adapter.send(request, **kwargs)
        File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/adapters.py", line 516, in send
          raise ConnectionError(e, request=request)
      requests.exceptions.ConnectionError: HTTPConnectionPool(host='10.15.5.127', port=8500): Max retries exceeded with url: /v1/status/leader (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f492c092fa0>: Failed to establish a new connection: [Errno 111] Connection refused'))
    
========
JMXFetch
========

  Initialized checks
  ==================
    no checks
    
  Failed checks
  =============
    no checks
    
=========
Forwarder
=========

  Transactions
  ============
    CheckRunsV1: 472
    Connections: 0
    Containers: 0
    Dropped: 0
    DroppedOnInput: 0
    Events: 0
    HostMetadata: 0
    IntakeV1: 60
    Metadata: 0
    Pods: 0
    Processes: 0
    RTContainers: 0
    RTProcesses: 0
    Requeued: 0
    Retried: 0
    RetryQueueSize: 0
    Series: 0
    ServiceChecks: 0
    SketchSeries: 0
    Success: 1,004
    TimeseriesV1: 472

  API Keys status
  ===============
    API key ending with xxxxx: API Key valid

==========
Endpoints
==========
  https://app.datadoghq.eu - API Key ending with:
      - xxxxx

==========
Logs Agent
==========
    Sending compressed logs in HTTPS to agent-http-intake.logs.datadoghq.eu on port 443
    BytesSent: 4.0940631e+07
    EncodedBytesSent: 3.099292e+06
    LogsProcessed: 48203
    LogsSent: 48135

  datadog/datadog-5gwph/process-agent
  -----------------------------------
    Type: file
    Path: /var/log/pods/datadog_datadog-5gwph_415ce591-db39-4514-b36f-05de797b87b3/process-agent/*.log
    Status: Pending
      1 files tailed out of 1 files matching

  consul/consul-mesh-gateway-dcf8564c9-9zqsf/service-init
  -------------------------------------------------------
    Type: file
    Path: /var/log/pods/consul_consul-mesh-gateway-dcf8564c9-9zqsf_ef815eee-90f5-42e2-bdc7-e148bebbe6cc/service-init/*.log
    Status: Pending
      1 files tailed out of 1 files matching

  consul/consul-mesh-gateway-dcf8564c9-9zqsf/get-auto-encrypt-client-ca
  ---------------------------------------------------------------------
    Type: file
    Path: /var/log/pods/consul_consul-mesh-gateway-dcf8564c9-9zqsf_ef815eee-90f5-42e2-bdc7-e148bebbe6cc/get-auto-encrypt-client-ca/*.log
    Status: Pending
      1 files tailed out of 1 files matching

  consul/consul-jstnd/client-acl-init
  -----------------------------------
    Type: file
    Path: /var/log/pods/consul_consul-jstnd_faec1b8e-82d3-4f55-9920-61ebc8754762/client-acl-init/*.log
    Status: OK
      1 files tailed out of 1 files matching
    Inputs: /var/log/pods/consul_consul-jstnd_faec1b8e-82d3-4f55-9920-61ebc8754762/client-acl-init/0.log 

  consul/consul-mesh-gateway-dcf8564c9-9zqsf/copy-consul-bin
  ----------------------------------------------------------
    Type: file
    Path: /var/log/pods/consul_consul-mesh-gateway-dcf8564c9-9zqsf_ef815eee-90f5-42e2-bdc7-e148bebbe6cc/copy-consul-bin/*.log
    Status: OK
      1 files tailed out of 1 files matching
    Inputs: /var/log/pods/consul_consul-mesh-gateway-dcf8564c9-9zqsf_ef815eee-90f5-42e2-bdc7-e148bebbe6cc/copy-consul-bin/0.log 

  consul/consul-server-0/consul
  -----------------------------
    Type: file
    Path: /var/log/pods/consul_consul-server-0_5cf038b3-1d69-459c-926a-41e14d6a8f48/consul/*.log
    Status: Pending
      1 files tailed out of 1 files matching

  consul/consul-connect-injector-webhook-deployment-7bf98cdc6c-zvlxq/get-auto-encrypt-client-ca
  ---------------------------------------------------------------------------------------------
    Type: file
    Path: /var/log/pods/consul_consul-connect-injector-webhook-deployment-7bf98cdc6c-zvlxq_e90d06aa-e972-4141-ae9b-12189ca1d64a/get-auto-encrypt-client-ca/*.log
    Status: Pending
      1 files tailed out of 1 files matching

  consul/consul-connect-injector-webhook-deployment-7bf98cdc6c-zvlxq/sidecar-injector
  -----------------------------------------------------------------------------------
    Type: file
    Path: /var/log/pods/consul_consul-connect-injector-webhook-deployment-7bf98cdc6c-zvlxq_e90d06aa-e972-4141-ae9b-12189ca1d64a/sidecar-injector/*.log
    Status: Pending
      1 files tailed out of 1 files matching

  consul/consul-mesh-gateway-dcf8564c9-9zqsf/lifecycle-sidecar
  ------------------------------------------------------------
    Type: file
    Path: /var/log/pods/consul_consul-mesh-gateway-dcf8564c9-9zqsf_ef815eee-90f5-42e2-bdc7-e148bebbe6cc/lifecycle-sidecar/*.log
    Status: Pending
      1 files tailed out of 1 files matching

  consul/consul-jstnd/consul
  --------------------------
    Type: file
    Path: /var/log/pods/consul_consul-jstnd_faec1b8e-82d3-4f55-9920-61ebc8754762/consul/*.log
    Status: OK
      1 files tailed out of 1 files matching
    Inputs: /var/log/pods/consul_consul-jstnd_faec1b8e-82d3-4f55-9920-61ebc8754762/consul/0.log 

  consul/consul-mesh-gateway-dcf8564c9-9zqsf/mesh-gateway
  -------------------------------------------------------
    Type: file
    Path: /var/log/pods/consul_consul-mesh-gateway-dcf8564c9-9zqsf_ef815eee-90f5-42e2-bdc7-e148bebbe6cc/mesh-gateway/*.log
    Status: Pending
      1 files tailed out of 1 files matching

============
System Probe
============
  System Probe is not running:

    Errors
    ======
    error setting up remote system probe util, socket path does not exist: stat /opt/datadog-agent/run/sysprobe.sock: no such file or directory

=========
Aggregator
=========
  Checks Metric Sample: 660,057
  Dogstatsd Metric Sample: 1,181
  Event: 10
  Events Flushed: 10
  Number Of Flushes: 472
  Series Flushed: 505,828
  Service Check: 11,266
  Service Checks Flushed: 11,720

=========
DogStatsD
=========
  Event Packets: 0
  Event Parse Errors: 0
  Metric Packets: 1,180
  Metric Parse Errors: 0
  Service Check Packets: 0
  Service Check Parse Errors: 0
  Udp Bytes: 65,372
  Udp Packet Reading Errors: 0
  Udp Packets: 1,181
  Uds Bytes: 0
  Uds Origin Detection Errors: 0
  Uds Packet Reading Errors: 0
  Uds Packets: 0

=====================
Datadog Cluster Agent
=====================

  - Datadog Cluster Agent endpoint detected: https://172.20.245.213:5005
  Successfully connected to the Datadog Cluster Agent.
  - Running: 1.5.2+commit.60ee741

agent secret output

Defaulting container name to agent.
Use 'kubectl describe pod/datadog-5gwph -n datadog' to see all of the containers in this pod.
=== Checking executable rights ===
Executable path: /readsecret.py
Check Rights: OK, the executable has the correct rights

Rights Detail:
file mode: 100500
Owner username: root
Group name: root

=== Secrets stats ===
Number of secrets decrypted: 1
Secrets handle decrypted:
- consul_acl_token: from consul

Additional environment details (Operating System, Cloud provider, etc): Consul 1.8-beta2, Datadog from Helm Chart v2.3.5

Steps to reproduce the issue:

  1. Configure Consul with ACL and TLS enabled on Kubernetes
  2. Try to configure Consul checks as follows

My Consul annotations:

ad.datadoghq.com/consul.logs: '[{ "source":"consul", "service":"consul" }]'
ad.datadoghq.com/consul.init_configs: '[{}]'
ad.datadoghq.com/consul.check_names: '["consul"]'
ad.datadoghq.com/consul.instances: |
  [{
    "url": "https://%%host%%:8501",
    "acl_token": "ENC[consul_acl_token]",
    "tls_verify": false,
    "tls_ignore_warning": true
  }]

I mounted a Kubernetes secret into the Datadog Agent pods (Helm setup) and configured secret support as follows:

env:
- name: DD_SECRET_BACKEND_COMMAND
  value: /readsecret.py
- name: DD_SECRET_BACKEND_ARGUMENTS
  value: "/etc/datadog-secrets"

volumes:
- name: datadog-secrets
  secret:
    secretName: datadog-secrets
volumeMounts:
- name: datadog-secrets
  mountPath: "/etc/datadog-secrets"
  readOnly: true
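
The contents of /readsecret.py are not shown in this issue. As a rough illustration of what such a backend executable does, here is a minimal sketch assuming the protocol described in the Datadog secrets management guide: the Agent writes a JSON request with a "secrets" list to stdin and expects a JSON map from each handle to a "value" and an "error" on stdout. The function names and the choice to strip trailing whitespace are assumptions of this sketch, not the behavior of the script used here.

```python
import json
import os
import sys


def resolve_secrets(request, secrets_dir):
    """Map each requested handle to the contents of the file with the same name."""
    results = {}
    for handle in request.get("secrets", []):
        # Refuse handles that would escape the secrets directory.
        if os.path.basename(handle) != handle:
            results[handle] = {"value": None, "error": "invalid handle"}
            continue
        try:
            with open(os.path.join(secrets_dir, handle), encoding="utf-8") as f:
                # Assumption of this sketch: drop surrounding whitespace
                # so a stray newline does not become part of the token.
                results[handle] = {"value": f.read().strip(), "error": None}
        except OSError as exc:
            results[handle] = {"value": None, "error": str(exc)}
    return results


def main(stdin, stdout, argv):
    """Read the Agent's request from stdin and write the answers to stdout."""
    # The directory comes from DD_SECRET_BACKEND_ARGUMENTS in the setup above.
    secrets_dir = argv[1] if len(argv) > 1 else "/etc/datadog-secrets"
    json.dump(resolve_secrets(json.load(stdin), secrets_dir), stdout)
```

To use this as the actual executable, the file would end with a call to main(sys.stdin, sys.stdout, sys.argv). With the volume mounted as above, the handle consul_acl_token would resolve to the contents of /etc/datadog-secrets/consul_acl_token.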

I configured the ACL policy and token in Consul and exported the token to Kubernetes with Terraform:

resource "consul_acl_policy" "monitoring" {
  name  = "monitoring"
  description = "Datadog Monitoring Policy"
  rules = <<-HCL
    event_prefix "" {
      policy = "read"
    }
    agent_prefix "" {
      policy = "read"
    }
    node_prefix "" {
      policy = "read"
    }
    service_prefix "" {
      policy = "read"
    }
  HCL
}

resource "consul_acl_token" "datadog" {
  description = "Datadog"
  policies = [consul_acl_policy.monitoring.name]
}

data "consul_acl_token_secret_id" "datadog" {
  accessor_id = consul_acl_token.datadog.accessor_id
}

resource "kubernetes_secret" "datadog_secrets" {
  metadata {
    namespace = module.datadog.namespace
    name = "datadog-secrets"
  }

  data = {
    consul_acl_token = data.consul_acl_token_secret_id.datadog.secret_id
  }
}

Describe the results you received:

Logs from Consul

2020-05-27T12:12:30.444Z [ERROR] agent.http: Request error: method=GET url=/v1/agent/self from=10.15.6.232:41628 error="ACL not found"
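
One way to check whether Consul recognizes a given token, independently of the Datadog Agent, is to ask Consul to describe the token via its /v1/acl/token/self endpoint. The sketch below is only an illustration: the function names are hypothetical, and the TLS handling mirrors the "tls_verify": false setting from the annotations above, which is only appropriate for setups with self-signed certificates.

```python
import json
import ssl
import urllib.request


def build_token_self_request(base_url, token):
    """Build a GET request for /v1/acl/token/self carrying the token as a header."""
    url = base_url.rstrip("/") + "/v1/acl/token/self"
    return urllib.request.Request(url, headers={"X-Consul-Token": token}, method="GET")


def read_own_token(base_url, token, verify_tls=True, timeout=5):
    """Return the token's metadata as a dict.

    Raises urllib.error.HTTPError if Consul rejects the request.
    """
    context = ssl.create_default_context()
    if not verify_tls:
        # Skip certificate checks, as "tls_verify": false does for the check.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    request = build_token_self_request(base_url, token)
    with urllib.request.urlopen(request, timeout=timeout, context=context) as resp:
        return json.load(resp)
```

For example, read_own_token("https://<consul-address>:8501", token, verify_tls=False) should return the token's accessor and policies when the secret is valid; a rejected request would point at the token value itself rather than at the check.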

Logs from Datadog agent

Error running check consul: [{"message": "403 Client Error: Forbidden for url: https://10.15.4.223:8501/v1/agent/self", "traceback": "Traceback (most recent call last):
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 820, in run
    self.check(instance)
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/consul/consul.py\", line 244, in check
    self._collect_metadata()
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/consul/consul.py\", line 528, in _collect_metadata
    local_config = self._get_local_config()
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/consul/consul.py\", line 103, in _get_local_config
    self._local_config = self.consul_request('/v1/agent/self')
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/consul/consul.py\", line 76, in consul_request
    resp.raise_for_status()
  File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/models.py\", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://10.15.4.223:8501/v1/agent/self
"}]

Additionally, when I replace ENC[consul_acl_token] with a plaintext token, I get:

 Error running check consul: [{"message": "HTTPConnectionPool(host='10.15.4.37', port=8500): Max retries exceeded with url: /v1/status/leader (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb812f11d90>: Failed to establish a new connection: [Errno 111] Connection refused'))", "traceback": "Traceback (most recent call last):

Describe the results you expected: No errors when running checks with ENC[…]. Only the checks defined by annotations should run. Note that I get errors connecting to Consul over HTTP on port 8500, even though I explicitly pass HTTPS on port 8501.

Additional information you deem important (e.g. issue happens only occasionally): I went through https://docs.datadoghq.com/agent/guide/secrets-management/?tab=linux#troubleshooting and one thing does not match the expected results in the guide:

When I run agent configcheck I get:

=== consul check ===
Configuration provider: kubernetes
Configuration source: kubelet:docker://7ec881bfcd6073c234d56a9a03c58161fd03c17a50cbaeef985610684cde0ead
Instance ID: consul:6a74708f3e6fa3c6
acl_token: ********
tags:
- kube_namespace:consul
- kube_stateful_set:consul-server
- kube_container_name:consul
- kube_service:consul-ui
- kube_service:consul-dns
- docker_image:consul:1.8.0-beta2
- short_image:consul
- image_tag:1.8.0-beta2
- persistentvolumeclaim:data-consul-consul-server-0
- kube_service:consul-server
- image_name:consul
- pod_phase:running
tls_ignore_warning: true
tls_verify: false
url: https://10.15.5.241:8501
~
Init Config:
{}
Auto-discovery IDs:
* docker://7ec881bfcd6073c234d56a9a03c58161fd03c17a50cbaeef985610684cde0ead
===
=== consul check ===
Configuration provider: kubernetes
Configuration source: kubelet:docker://d138d03e36f46c31d31bde77b0f439c0d1ada553425d0d739aeffcefdcfb8d0c
Instance ID: consul:5bd82da03f1c7903
acl_token: ********
tags:
- kube_namespace:consul
- short_image:consul
- pod_phase:running
- kube_stateful_set:consul-server
- kube_container_name:consul
- image_tag:1.8.0-beta2
- kube_service:consul-server
- kube_service:consul-ui
- persistentvolumeclaim:data-consul-consul-server-0
- docker_image:consul:1.8.0-beta2
- image_name:consul
- kube_service:consul-dns
tls_ignore_warning: true
tls_verify: false
url: https://10.15.5.228:8501
~
Init Config:
{}
Auto-discovery IDs:
* docker://d138d03e36f46c31d31bde77b0f439c0d1ada553425d0d739aeffcefdcfb8d0c
===
=== consul check ===
Configuration provider: file
Configuration source: file:/etc/datadog-agent/conf.d/consul.d/auto_conf.yaml
Instance ID: consul:93e60a3b2d57d7a2
catalog_checks: true
new_leader_checks: true
tags:
- kube_container_name:copy-consul-bin
- short_image:consul
- image_tag:1.8.0-beta2
- docker_image:consul:1.8.0-beta2
- image_name:consul
- kube_namespace:consul
- pod_phase:running
- kube_deployment:consul-mesh-gateway
url: http://10.15.5.127:8500
~
Auto-discovery IDs:
* consul
===
=== consul check ===
Configuration provider: kubernetes
Configuration source: kubelet:docker://b052fa34d8aad72b7402fb1bc6e3ebf16cd0e8057e1ce4db59af9dac14762ea3
Instance ID: consul:b3d726e8a50f827f
acl_token: ********
tags:
- image_tag:1.8.0-beta2
- kube_daemon_set:consul
- image_name:consul
- kube_service:consul-dns
- pod_phase:running
- docker_image:consul:1.8.0-beta2
- short_image:consul
- kube_namespace:consul
- kube_container_name:consul
tls_ignore_warning: true
tls_verify: false
url: https://10.15.5.228:8501
~

Note that a check from /etc/datadog-agent/conf.d/consul.d/auto_conf.yaml is present, which is not expected. I also see that acl_token shows up as ******** and not as the decrypted value of the token. According to the docs, it should show up decrypted:

sudo -u dd-agent -- datadog-agent configcheck

=== a check ===
Source: File Configuration Provider
Instance 1:
host: <decrypted_host>
port: <decrypted_port>
password: <decrypted_password>
~
===
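
To spot unexpected file-based instances in longer configcheck output like the consul listing above, the output can be scanned for "Configuration source" lines. This is a small sketch that assumes the text layout shown in this issue ("=== <name> check ===" headers followed by "Configuration source:" lines); the function names are hypothetical and the layout may differ between Agent versions.

```python
def list_check_sources(configcheck_output):
    """Return (check_name, configuration_source) pairs found in the output."""
    pairs = []
    current = None
    for raw in configcheck_output.splitlines():
        line = raw.strip()
        if line.startswith("=== ") and line.endswith(" check ==="):
            # Header such as "=== consul check ===" starts a new instance.
            current = line[len("=== "):-len(" check ===")]
        elif line.startswith("Configuration source:") and current is not None:
            pairs.append((current, line.split(":", 1)[1].strip()))
    return pairs


def file_based_sources(configcheck_output, check_name):
    """Return the sources of the given check that come from files on disk."""
    return [
        source
        for name, source in list_check_sources(configcheck_output)
        if name == check_name and source.startswith("file:")
    ]
```

Applied to the consul output above, file_based_sources(output, "consul") would return the auto_conf.yaml entry, which is the instance that still targets http on port 8500.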

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 6 (3 by maintainers)

Top GitHub Comments

2 reactions
FlorianVeaux commented, May 28, 2020

As far as I can see, this is not a bug in the consul integration. Integrations are not in charge of decrypting secrets, so if the integration works with the token as plain text, it is working as expected. Also, the core agent is able to decrypt the secret, so 🤔?

But maybe you can try different agent versions and see whether there is a regression in one of them? As you said, GitHub is a valid place for bug reports, but it is still unclear whether this is a bug.

If you want to debug this yourself, one thing you can try is to run the consul check manually with this command: agent check consul -b 243. This sets a breakpoint and opens a pdb debugger on the first line of the check method. From there you can see for yourself what the integration is doing. In particular, you can run print(self.http.options['headers']['X-Consul-Token']) to make sure that the consul integration is attaching your ACL token as a header, and confirm that the token is valid.

1 reaction
krzysztof-miemiec commented, May 28, 2020

Thanks for another great tip - I wasn’t aware of such a powerful built-in debug feature! After running it, I see no X-Consul-Token header being set, and no acl_token present in self.instance. It looks like a conflict between auto_conf.yaml and the configuration I provided 🤔

I applied the changes regarding the volume, rolled back from the hardcoded value to ENC[consul_acl_token], and it looks like it’s working correctly now.
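
For anyone repeating the pdb session described above, a tiny helper can make the three possible states of the instance explicit: no acl_token at all (as seen here with the auto_conf.yaml instance), a token that is still an unresolved ENC[...] handle, or a resolved value. This is only a sketch; token_status is a hypothetical name, and it assumes the instance is a plain dict like self.instance.

```python
import re

# A secret handle occupies the whole value, e.g. "ENC[consul_acl_token]".
_HANDLE = re.compile(r"^ENC\[(.+)\]$")


def token_status(instance):
    """Classify the acl_token of a check instance without printing its value."""
    token = instance.get("acl_token")
    if token is None:
        return "missing"
    if _HANDLE.match(str(token)):
        return "unresolved handle"
    return "set"
```

Inside the debugger, token_status(self.instance) returning "missing" would suggest that the instance being run is not the one defined by the annotations.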
