Agent v6: kube-dns metrics are constantly increasing despite constant traffic
I recently deployed v6.1.0 of the Datadog Agent to one of my Kubernetes clusters. I added the annotations to my instances of kube-dns so that the agent would collect metrics from them, and I've let that run for the better part of two days.
When I look at the metrics for kubedns.request_count, it appears that the number of DNS requests has been increasing steadily over the past two days, when I know for a fact that traffic and activity on the cluster have been steady. (It's the weekend, so not much is going on.)
I know that kube-dns exports its metrics in the Prometheus format, so counter metrics are monotonically increasing. (See metrics formats.) But it's supposed to be the job of the scraper to take the difference between the values at T(n) and T(n-1) to calculate the count of events in that interval.
My assumption is that the Prometheus scraper in the Datadog Agent is supposed to do that subtraction, so I'd expect the graph above to be a flat line with a slope of 0, rather than a steadily increasing one.
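A minimal sketch of the conversion I mean (my own illustration of the arithmetic, not the Agent's actual scraper code):

```python
# Given two consecutive scrapes of a Prometheus counter, report the delta
# (or per-second rate) for the interval instead of the raw cumulative value.

def counter_delta(prev_value, prev_time, curr_value, curr_time):
    """Return (count_in_interval, per_second_rate) for one counter series."""
    if curr_value < prev_value:
        # Counter reset (e.g. kube-dns restarted): treat the current value
        # as the count accumulated since the reset.
        delta = curr_value
    else:
        delta = curr_value - prev_value
    elapsed = curr_time - prev_time
    rate = delta / elapsed if elapsed > 0 else 0.0
    return delta, rate

# Example: the request counter scraped 15 seconds apart (made-up values).
delta, rate = counter_delta(120_000, 0.0, 120_450, 15.0)
print(delta, rate)  # 450 requests in the interval, 30 requests/second
```

Graphed that way, steady traffic shows up as a flat line instead of an ever-growing one.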
- Datadog Agent version: 6.1.0
- kube-dns image: gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.5
- Kubernetes cluster version: 1.5.7 (though I don't think that should affect the Prometheus scraping)
Top GitHub Comments
Hello @antoinepouille, it looks like kubedns.requests_duration.seconds.count still returns accumulated values, not the rate. The same happens for kubedns.requests_duration.seconds.sum, though with different values. How do I correctly set up a timeseries visualization for requests_duration?
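The arithmetic behind that question, as a rough sketch (the sample values below are made up, not output from the check): with cumulative histogram series, the average request duration over a scrape interval is the change in sum divided by the change in count.

```python
# Two hypothetical consecutive scrapes of the cumulative histogram series.
prev = {"sum": 812.4, "count": 120_000}   # seconds, requests
curr = {"sum": 825.9, "count": 120_450}

delta_sum = curr["sum"] - prev["sum"]        # seconds spent serving requests in the interval
delta_count = curr["count"] - prev["count"]  # requests served in the interval

avg_latency = delta_sum / delta_count if delta_count else 0.0
print(avg_latency)  # ~0.03 s average request duration over the interval
```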
@jonmoter Thanks for pulling this out of the code. I could sync up with an engineer from the metrics team to figure this out: this is actually causing some issues when using the metrics, so I think we can create a second metric that uses the monotonic_count function to forward the counts directly instead of the current raw value. I will forward this to the team so that we schedule some work to tackle it. Stay tuned.
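Roughly, a check could submit the same scraped value both ways; only the names and values below are illustrative, this is not the actual kubedns check code:

```python
# Sketch of forwarding a cumulative Prometheus counter via monotonic_count,
# assuming the standard AgentCheck interface from the Agent v6 check base.
from datadog_checks.base import AgentCheck


class KubeDNSCounterCheck(AgentCheck):
    def check(self, instance):
        # Pretend this raw cumulative value was scraped from the kube-dns
        # /metrics endpoint (made-up number for illustration).
        raw_request_count = 120_450

        # gauge() reproduces the ever-increasing line described in the issue.
        self.gauge('kubedns.request_count.raw', raw_request_count)

        # monotonic_count() lets the Agent diff successive submissions, so the
        # stored metric is the count of requests per collection interval.
        self.monotonic_count('kubedns.request_count.count', raw_request_count)
```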