[envoy integration] Metrics missing
See original GitHub issue. Note: If you have a feature request, you should contact support so the request can be properly tracked.
Output of the info page
root@datadog-cluster-agent-69bc84c5c-rrkch:/# datadog-cluster-agent status
Getting the status from the agent.
2022-09-02 07:00:49 UTC | CLUSTER | WARN | (pkg/util/log/log.go:591 in func1) | Agent configuration relax permissions constraint on the secret backend cmd, Group can read and exec
===============================
Datadog Cluster Agent (v1.22.0)
===============================
Status date: 2022-09-02 07:00:49.867 UTC (1662102049867)
Agent start: 2022-08-30 08:45:54.797 UTC (1661849154797)
Pid: 1
Go Version: go1.17.11
Build arch: amd64
Agent flavor: cluster_agent
Check Runners: 4
Log Level: WARN
Paths
=====
Config File: /etc/datadog-agent/datadog-cluster.yaml
conf.d: /etc/datadog-agent/conf.d
Clocks
======
System time: 2022-09-02 07:00:49.867 UTC (1662102049867)
Hostnames
=========
ec2-hostname: ****
host_aliases: [***]
hostname: ****
instance-id: ***
socket-fqdn: datadog-cluster-agent-69bc84c5c-rrkch
socket-hostname: datadog-cluster-agent-69bc84c5c-rrkch
hostname provider: container
unused hostname providers:
aws: Unable to determine hostname from EC2: Get "http://169.254.169.254/latest/meta-data/instance-id": dial tcp 169.254.169.254:80: connect: connection refused
azure: azure_hostname_style is set to 'os'
configuration/environment: hostname is empty
gce: unable to retrieve hostname from GCE: GCE metadata API error: Get "http://169.254.169.254/computeMetadata/v1/instance/hostname": dial tcp 169.254.169.254:80: connect: connection refused
Metadata
========
Leader Election
===============
Leader Election Status: Running
Leader Name is: datadog-cluster-agent-69bc84c5c-r6r98
Last Acquisition of the lease: Fri, 26 Aug 2022 14:02:50 UTC
Renewed leadership: Fri, 02 Sep 2022 07:00:41 UTC
Number of leader transitions: 13 transitions
Custom Metrics Server
=====================
Data sources
------------
URL: https://api.datadoghq.com
ConfigMap name: default/datadog-custom-metrics
External Metrics
----------------
Total: 0
Valid: 0
Cluster Checks Dispatching
==========================
Status: Follower, redirecting to leader at 10.42.224.6
Admission Controller
====================
Webhooks info
-------------
MutatingWebhookConfigurations name: datadog-webhook
Created at: 2022-06-01T07:04:25Z
---------
Name: datadog.webhook.config
CA bundle digest: 4a037a372da419e0
Object selector: &LabelSelector{MatchLabels:map[string]string{},MatchExpressions:[]LabelSelectorRequirement{LabelSelectorRequirement{Key:admission.datadoghq.com/enabled,Operator:NotIn,Values:[false],},},}
Rule 1: Operations: [CREATE] - APIGroups: [] - APIVersions: [v1] - Resources: [pods]
Service: default/datadog-cluster-agent-admission-controller - Port: 443 - Path: /injectconfig
---------
Name: datadog.webhook.tags
CA bundle digest: 4a037a372da419e0
Object selector: &LabelSelector{MatchLabels:map[string]string{},MatchExpressions:[]LabelSelectorRequirement{LabelSelectorRequirement{Key:admission.datadoghq.com/enabled,Operator:NotIn,Values:[false],},},}
Rule 1: Operations: [CREATE] - APIGroups: [] - APIVersions: [v1] - Resources: [pods]
Service: default/datadog-cluster-agent-admission-controller - Port: 443 - Path: /injecttags
Secret info
-----------
Secret name: webhook-certificate
Secret namespace: default
Created at: 2022-06-01T07:04:25Z
CA bundle digest: 4a037a372da419e0
Duration before certificate expiration: 6528h3m34.106622362s
=========
Collector
=========
Running Checks
==============
kubernetes_apiserver
--------------------
Instance ID: kubernetes_apiserver [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/kubernetes_apiserver.d/conf.yaml.default
Total Runs: 16,860
Metric Samples: Last Run: 0, Total: 0
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2022-09-02 07:00:42 UTC (1662102042000)
Last Successful Execution Date : 2022-09-02 07:00:42 UTC (1662102042000)
orchestrator
------------
Instance ID: orchestrator:*** [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/orchestrator.d/conf.yaml.default
Total Runs: 25,290
Metric Samples: Last Run: 0, Total: 0
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2022-09-02 07:00:47 UTC (1662102047000)
Last Successful Execution Date : 2022-09-02 07:00:47 UTC (1662102047000)
=========
Forwarder
=========
Transactions
============
Cluster: 0
ClusterRole: 0
ClusterRoleBinding: 0
CronJob: 0
DaemonSet: 0
Deployment: 0
Dropped: 0
HighPriorityQueueFull: 0
Ingress: 0
Job: 0
Node: 0
PersistentVolume: 0
PersistentVolumeClaim: 0
Pod: 0
ReplicaSet: 0
Requeued: 300
Retried: 94
RetryQueueSize: 0
Role: 0
RoleBinding: 0
Service: 0
ServiceAccount: 0
StatefulSet: 0
Transaction Successes
=====================
Total number: 33719
Successes By Endpoint:
check_run_v1: 16,859
intake: 1
series_v1: 16,859
Transaction Errors
==================
Total number: 11
Errors By Type:
DNSErrors: 11
On-disk storage
===============
On-disk storage is disabled. Configure `forwarder_storage_max_size_in_bytes` to enable it.
==========
Endpoints
==========
https://app.datadoghq.com - API Key ending with:
- 1f056
=====================
Orchestrator Explorer
=====================
Collection Status: Clusterchecks are activated but still warming up, the collection could be running on CLC Runners. To verify that we need the clusterchecks to be warmed up.
Cluster Name: ***
Cluster ID: ****
Container scrubbing: enabled
======================
Orchestrator Endpoints
======================
https://orchestrator.datadoghq.com - API Key ending with: *****
Status: Follower, cluster agent leader is: datadog-cluster-agent-69bc84c5c-r6r98
Additional environment details (Operating System, Cloud provider, etc): There is a support case (901101), but it didn't make much progress.
Steps to reproduce the issue:
- I have Istio installed in my cluster and need some Envoy-level metrics, so I configured the annotations below on the app pods to scrape the Envoy metrics:
  ad.datadoghq.com/istio-proxy.check_names: '["envoy"]'
  ad.datadoghq.com/istio-proxy.init_configs: '[{}]'
  ad.datadoghq.com/istio-proxy.instances: |
    [
      {
        "openmetrics_endpoint": "http://%%host%%:15090/stats/prometheus",
        "histogram_buckets_as_distributions": "true",
        "log_requests": "true",
        "extra_metrics": [
          {
            "envoy_cluster_upstream_rq_time": {
              "name": "cluster.upstream_rq_time",
              "type": "histogram"
            }
          }
        ]
      }
    ]
- Send some traffic from one pod to the other. The metrics show up on the proxy's metrics endpoint and in Prometheus (a quick way to verify the raw series is sketched just below).
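To double-check that the raw Envoy series actually reach the sidecar's merged Prometheus endpoint before the Agent scrapes them, a port-forward plus grep is usually enough; the pod and namespace names below are placeholders:

  # Forward the istio-proxy merged stats port (15090 by default) from the app pod
  kubectl -n my-namespace port-forward pod/my-app-pod 15090:15090 &
  # Look for the raw histogram that the extra_metrics entry tries to rename
  curl -s http://localhost:15090/stats/prometheus | grep envoy_cluster_upstream_rq_time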
Describe the results you received: I could find these metrics on the endpoint and in Prometheus, but I could not find them in the Datadog Metrics Explorer. Except for the first one, the others are included in your metrics dictionary:
- cluster.upstream_rq_time
- cluster.upstream_cx_rx_bytes_total
- cluster.upstream_cx_tx_bytes_total
- listener.downstream_cx_length_ms
- cluster.upstream_rq_xx (the raw metrics carry specific status codes; I guess the agent will parse them?)
- Some metrics have data, but the values differ from the raw metrics or the Prometheus scrapes. Does Datadog or the query apply some aggregation in the Metrics Explorer?
- Support asked me to add 'status_url', but I guess it won't work for the v2, openmetrics_endpoint-based integration? (A file-based sketch of that v2 form is included below for comparison.)
- Some metric types differ from the types exposed by the pod; for example, a 'counter' is converted to a 'rate'. Is this expected, or is there a misconfiguration somewhere?
Describe the results you expected: the metrics listed above are scraped and show up in Datadog.
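For comparison, here is the same instance expressed as a file-based Autodiscovery template on the node Agent. This is only a sketch: the ad_identifiers value (commonly the istio-proxy image short name, e.g. proxyv2) is an assumption, and it presumes the current Envoy check, built on the OpenMetrics V2 base, accepts this extra_metrics mapping form:

  # conf.d/envoy.d/conf.yaml (sketch, mirrors the pod annotation above)
  ad_identifiers:
    - proxyv2        # assumption: short image name of the istio-proxy container
  init_config:
  instances:
    - openmetrics_endpoint: "http://%%host%%:15090/stats/prometheus"
      histogram_buckets_as_distributions: true
      log_requests: true
      extra_metrics:
        - envoy_cluster_upstream_rq_time:
            name: cluster.upstream_rq_time
            type: histogram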
Additional information you deem important (e.g. issue happens only occasionally):
Issue Analytics
- Created: a year ago
- Reactions: 1
- Comments: 11 (5 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Hey @burningalchemist, unfortunately we don't have any updates on this.
@yzhan289 having the same issue, I bet the extra_metrics field is non-effective. I believe envoy_cluster_upstream_rq_time is important to have as part of the integration, to balance the existing envoy.http.downstream_rq_time while staying with openmetrics_endpoint. Would you mind reopening the issue?
@Shuanglu in the meantime, did you find a solution?
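One way to check whether the extra_metrics rename is being applied is to run the check by hand inside the node Agent pod that scrapes the annotated workload and inspect the series it would submit; the Agent pod name below is a placeholder:

  kubectl exec -it datadog-agent-xxxxx -c agent -- agent check envoy --check-rate
  # The output lists every metric the check would send; if the renamed series
  # (expected under the envoy.* namespace) is absent while the raw
  # envoy_cluster_upstream_rq_time is present on the endpoint, the rename is not taking effect.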