Connection state metrics for dockerized agent and host networking
Output of the info page
====================
Collector (v 5.22.3)
====================
Status date: 2018-05-15 05:16:49 (1s ago)
Pid: 6095
Platform: Linux-4.14.32-coreos-x86_64-with
Python Version: 2.7.14, 64bit
Logs: <stderr>, /opt/datadog-agent/logs/collector.log
Clocks
======
NTP offset: 0.0011 s
System UTC time: 2018-05-15 05:16:51.565257
Paths
=====
conf.d: /opt/datadog-agent/agent/conf.d
checks.d: /opt/datadog-agent/agent/checks.d
Hostnames
=========
ec2-hostname: ip-10-0-2-200.ec2.internal
local-ipv4: 10.0.2.200
local-hostname: ip-10-0-2-200.ec2.internal
socket-hostname: ip-10-0-2-200.ec2.internal
public-hostname: ec2-34-229-87-92.compute-1.amazonaws.com
hostname: i-0de44ea41cc34c069
instance-id: i-0de44ea41cc34c069
public-ipv4: 34.229.87.92
socket-fqdn: 10.0.2.200
Checks
======
linux_proc_extras (1.0.0)
-------------------------
- instance #0 [ERROR]: 'get_subprocess_output expected output but had none.'
- Collected 6 metrics, 0 events & 0 service checks
network (1.4.0)
---------------
- instance #0 [WARNING]
Warning: Cannot collect connection state: currently with a custom /proc path: /host/proc/1
- Collected 20 metrics, 0 events & 0 service checks
ntp (1.0.0)
-----------
- Collected 0 metrics, 0 events & 0 service checks
cassandra_nodetool (0.1.1)
--------------------------
- instance #0 [OK]
- Collected 16 metrics, 0 events & 3 service checks
consul (1.3.0)
--------------
- instance #0 [OK]
- Collected 1 metric, 0 events & 0 service checks
disk (1.1.0)
------------
- instance #0 [OK]
- Collected 34 metrics, 0 events & 0 service checks
docker_daemon (1.8.0)
---------------------
- instance #0 [OK]
- Collected 29 metrics, 0 events & 1 service check
cassandra (5.22.3)
------------------
- instance #cassandra-localhost-7199 [WARNING] collected 350 metrics
Warning: Number of returned metrics is too high for instance: cassandra-localhost-7199. Please read http://docs.datadoghq.com/integrations/java/ or get in touch with Datadog Support for more details. Truncating to 350 metrics.
- Collected 350 metrics, 0 events & 0 service checks
Emitters
========
- http_emitter [OK]
====================
Dogstatsd (v 5.22.3)
====================
Status date: 2018-05-15 05:16:41 (9s ago)
Pid: 6093
Platform: Linux-4.14.32-coreos-x86_64-with
Python Version: 2.7.14, 64bit
Logs: <stderr>, /opt/datadog-agent/logs/dogstatsd.log
Flush count: 147
Packet Count: 64202
Packets per second: 56.6
Metric count: 420
Event count: 0
Service check count: 0
====================
Forwarder (v 5.22.3)
====================
Status date: 2018-05-15 05:16:51 (0s ago)
Pid: 6094
Platform: Linux-4.14.32-coreos-x86_64-with
Python Version: 2.7.14, 64bit
Logs: <stderr>, /opt/datadog-agent/logs/forwarder.log
Queue Size: 7704 bytes
Queue Length: 3
Flush Count: 498
Transactions received: 404
Transactions flushed: 401
Transactions rejected: 0
API Key Status: API Key is valid
Additional environment details (Operating System, Cloud provider, etc):
CoreOS 1688.5.3 running on AWS
Steps to reproduce the issue:
- Build a docker image based off the latest alpine image with the following network.yaml check configuration built in:

```yaml
init_config:

instances:
  - collect_connection_state: true
    excluded_interfaces:
      - lo
      - lo0
      - docker0
    # Ignore Docker's virtual interfaces:
    excluded_interface_re: veth*
```
- Run the datadog agent container with the following mounts:
-v /var/run/docker.sock:/var/run/docker.sock -v /proc/:/host/proc/:ro -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro -v /etc/passwd:/etc/passwd:ro
- Run `docker exec -it datadog /opt/datadog-agent/bin/agent info`
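Putting the steps together, a full invocation might look like this (the image name and API key are placeholders, and `--network host` is an assumption based on the issue title — host networking is what exposes the host's connections to the check):

```shell
docker run -d --name datadog \
  --network host \
  -e API_KEY=<your_api_key> \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /proc/:/host/proc/:ro \
  -v /sys/fs/cgroup/:/host/sys/fs/cgroup:ro \
  -v /etc/passwd:/etc/passwd:ro \
  my-datadog-agent:latest
```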
Describe the results you received:
The network check fails to capture some host network metrics.
Describe the results you expected:
The network check should work as it did in previous versions of the agent.
Additional information you deem important (e.g. issue happens only occasionally):
This is the same issue as #1131. I don’t believe the solution provided in that issue was correct; the problem occurs due to a combination of issues.

First, the solution in #1131 suggested setting `procfs_path` in `process.yaml`. However, not only is that deprecated, it won’t actually work: the check ignores any value of `procfs_path` that differs from the agent config.

The suggested solution also mentions overriding `procfs_path` for the `network` check. However, the `network` check does not read a `procfs_path` from its `init_config`; it only honors the `procfs_path` from the agent config.

Also, since `procfs_path` is now an agent-wide setting, it seems problematic to override it to a value that only fixes one check. Instead, the warning should be ignorable when host networking is used in a container.
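For context, connection-state metrics are ultimately derived from the `st` column of `<procfs_path>/net/tcp`, which is why the procfs path matters here. A minimal illustrative sketch of that parsing (not the agent's actual code — the agent shells out to ss/netstat instead):

```python
# Illustrative sketch: counting TCP connection states from /proc/net/tcp-style
# lines. Not the agent's implementation.
from collections import Counter

# Hex state codes as defined in the kernel (include/net/tcp_states.h)
TCP_STATES = {
    "01": "established",
    "06": "time_wait",
    "0A": "listening",
    "0B": "closing",
}

def count_tcp_states(lines):
    """Count connections per state; `lines` is the content of /proc/net/tcp."""
    counts = Counter()
    for line in lines[1:]:  # skip the header line
        fields = line.split()
        state_hex = fields[3]  # the 'st' column
        counts[TCP_STATES.get(state_hex, "other")] += 1
    return counts

sample = [
    "  sl  local_address rem_address   st tx_queue rx_queue ...",
    "   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000 ...",
    "   1: C800000A:B2D6 0400000A:1F90 01 00000000:00000000 ...",
]
print(dict(count_tcp_states(sample)))  # → {'listening': 1, 'established': 1}
```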
Created 5 years ago · 10 comments (5 by maintainers)
Top GitHub Comments
Using the following docker-compose.yaml and Dockerfile I got, afaik, correct data for the system.net.tcp4.[opening|listening|established|…] metrics.

The key points here are that the environment variable `DD_PROCFS_PATH` is set to `/proc` (it defaults to `/host/proc` in a dockerized environment), that `network_mode` is set to `host`, and that the relevant tools, ss and netstat, are installed. The last point is why I needed a custom Dockerfile: the package iproute2 provides ss, and net-tools provides netstat. Without the Dockerfile (i.e. without the ss and netstat commands), I got the error `Error collecting connection stats.` (see https://github.com/DataDog/integrations-core/blob/master/network/datadog_checks/network/network.py#L363). Apart from the needed Dockerfile, the config is similar to those mentioned in #1131.
Dockerfile:
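A minimal sketch of such a Dockerfile — the base image is an assumption (agent 5 shipped as datadog/docker-dd-agent); the essential step is installing iproute2 and net-tools for ss and netstat:

```dockerfile
# Base image is an assumption; agent 5 shipped as datadog/docker-dd-agent
FROM datadog/docker-dd-agent:latest

# The network check needs ss (from iproute2) and netstat (from net-tools)
# to collect connection-state metrics
RUN apt-get update \
    && apt-get install -y --no-install-recommends iproute2 net-tools \
    && rm -rf /var/lib/apt/lists/*
```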
docker-compose.yaml:
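A sketch of the compose file with the settings called out above (the API key is a placeholder; the mounts are taken from the reproduction steps):

```yaml
version: "2"
services:
  datadog:
    build: .                   # the custom Dockerfile above
    network_mode: host         # with host networking, /proc/net reflects the host
    environment:
      - API_KEY=<your_api_key>
      - DD_PROCFS_PATH=/proc   # override the dockerized default of /host/proc
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /proc/:/host/proc/:ro
      - /sys/fs/cgroup/:/host/sys/fs/cgroup:ro
```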
I’m not quite sure if network.d/conf.yaml and/or process.d/conf.yaml is relevant, but here is my config: conf.d/network.d/conf.yaml
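A sketch of what conf.d/network.d/conf.yaml would contain, assuming the same settings as in the reproduction steps above:

```yaml
init_config:

instances:
  - collect_connection_state: true
    excluded_interfaces:
      - lo
      - lo0
      - docker0
    excluded_interface_re: veth*
```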
conf.d/process.d/conf.yaml