istio.mesh.request.count metric should be a monotonic count
Output of the info page
===============
Agent (v6.11.1)
===============
Status date: 2019-05-07 16:50:41.559676 UTC
Agent start: 2019-05-07 00:55:41.341177 UTC
Pid: 339
Python Version: 2.7.16
Check Runners: 4
Log Level: info
...
istio (2.1.0)
-------------
Instance ID: istio:c630785e1593291 [OK]
Total Runs: 3,839
Metric Samples: Last Run: 148, Total: 568,172
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 51ms
Additional environment details (Operating System, Cloud provider, etc):
- Kubernetes version 1.11.7
- Running on AWS via kops
- Istio version 1.1.4, installed via Helm chart
The istio-telemetry pod has the following annotations:
ad.datadoghq.com/mixer.check_names: '["istio"]'
ad.datadoghq.com/mixer.init_configs: '[{}]'
ad.datadoghq.com/mixer.instances: |
  [
    {
      "istio_mesh_endpoint": "http://%%host%%:42422/metrics",
      "mixer_endpoint": "http://%%host%%:15014/metrics",
      "send_histograms_buckets": true
    }
  ]
Steps to reproduce the issue:
My Kubernetes cluster running Istio has very little activity on it, so the only requests happening regularly are health checks for the running services.
However, when I look at the istio.mesh.request.count metric in Datadog for that cluster, it is steadily increasing at a constant rate.
Describe the results you expected:
Since the traffic is constant, I expect that metric to be a flat horizontal line.
I believe this is because Istio exposes that metric in the Prometheus/OpenMetrics format, where counters are cumulative and only ever increase. The Datadog Agent should take that into account and subtract the Nth value from the (N+1)th value.
I’ve seen errors like this before, like in #1303, which I opened about a similar kube-dns metric.
See also #3121, which raises the same issue and also covers how to configure the Istio check when running as a DaemonSet.
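To illustrate the subtraction described above, here is a minimal sketch of turning cumulative counter readings into per-interval increases. The function name `monotonic_deltas` and the sample readings are hypothetical, and treating a drop in the value as a counter restart is an assumption made for this example, not necessarily what the Agent does.

```python
def monotonic_deltas(samples):
    """Yield the increase between consecutive cumulative counter readings."""
    previous = None
    for value in samples:
        if previous is not None:
            # Assumption: a lower value means the counter restarted from zero,
            # so the new reading itself is taken as the increase.
            yield value - previous if value >= previous else value
        previous = value


# Hypothetical readings scraped at a fixed interval under constant traffic,
# with one restart between 130 and 5.
readings = [100, 110, 120, 130, 5, 15]
deltas = list(monotonic_deltas(readings))
print(deltas)
```

With constant traffic the raw readings climb steadily, while the deltas stay flat apart from the restart, which is the horizontal line I expected to see.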
Top GitHub Comments
Ah, I think I found a workaround.
I set the send_monotonic_counter flag to true in my annotations, and now it looks more like what I would expect.
I discovered that while looking through the istio check code:
https://github.com/DataDog/integrations-core/blob/bdf9216ba605ac7a60d1047cf8afc381d0fa1208/istio/datadog_checks/istio/istio.py#L163
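For anyone wanting to try the same workaround, here is a rough sketch that builds the value of the ad.datadoghq.com/mixer.instances annotation with the flag added. It assumes send_monotonic_counter sits alongside the other instance options shown in the original report; the variable names are only illustrative.

```python
import json

# Instance options from the original annotations, plus the workaround flag.
# Assumption: the flag is set per instance, next to the endpoints.
instance = {
    "istio_mesh_endpoint": "http://%%host%%:42422/metrics",
    "mixer_endpoint": "http://%%host%%:15014/metrics",
    "send_histograms_buckets": True,
    "send_monotonic_counter": True,
}

# The annotation value is a JSON list of instances.
annotation_value = json.dumps([instance], indent=2)
print(annotation_value)
```

The printed JSON can be pasted as the body of the ad.datadoghq.com/mixer.instances annotation on the istio-telemetry pod.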
You might want to change the default, or at the very least highlight that bit of configuration in your documentation. I think it’s way more intuitive that *.count metrics should be counts instead of gauges.

This issue has been automatically marked as stale because it has not had activity in the last 30 days. Note that this will not be automatically closed, but the notification will remind us to investigate why there’s been inactivity.
If you would like this issue to remain open:
Thank you for participating in the Datadog open source community!