`kubernetes.cpu.usage.total` is reported in nanocores

The Kubernetes integration (https://github.com/DataDog/integrations-core/tree/master/kubernetes) provides CPU stats in two different units:

  • cpus (for kubernetes.cpu.limits and kubernetes.cpu.requests)
  • percent_nano (for kubernetes.cpu.usage.total)

These metrics are useful to put on the same graph in order to tune the requests/limits settings for Kubernetes containers. To do that, you have to scale kubernetes.cpu.usage.total down by 1000000000 (one billion), because the value reported by the integration is in units of nano CPUs (i.e., percent_nano, though there doesn’t appear to be any percent math applied). Somewhat annoyingly, you have to specify 1000000000 as (1000000 * 1000), because the web UI will otherwise convert it to 1e9 in the JSON version of a graph, and 1e9 isn’t a valid value to scale a metric by.
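To make the arithmetic concrete, here is a minimal Python sketch of the conversion, assuming the reported value is in nanocores. The function name and the sample value are hypothetical and only illustrate the scaling; this is not code from the agent or the web UI.

```python
# One billion, spelled as a product the same way the issue spells it
# for the graph editor. In Python, 1e9 would work equally well.
NANOCORES_PER_CORE = 1_000_000 * 1_000


def nanocores_to_cores(nanocores: float) -> float:
    """Convert a CPU usage value from nanocores to whole cores."""
    return nanocores / NANOCORES_PER_CORE


# Hypothetical sample: a container using a quarter of one core.
usage_nanocores = 250_000_000
print(nanocores_to_cores(usage_nanocores))  # 0.25
```

After this conversion the usage value is in the same unit (cpus) as kubernetes.cpu.limits and kubernetes.cpu.requests, so the three can share one axis.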

The values are scraped from cAdvisor, which defines them here: https://github.com/google/cadvisor/blob/e14ee9be3506d260847d263e26a3e9e27f83ad96/info/v1/container.go#L267-L283, and which are pulled from the Docker daemon’s statistics (https://github.com/moby/moby/blob/8b1adf55c2af329a4334f21d9444d6a169000c81/daemon/stats/collector_unix.go#L27-L71). The metric is essentially a rate (per second) of nanoseconds of CPU time used (thus, nanocpu).
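As an illustration of why a per-second rate of CPU nanoseconds reads as "nanocores", the following Python sketch derives the rate from two samples of a cumulative CPU-time counter. The function name, argument names, and sample numbers are assumptions made for this example; it is not the cAdvisor or agent implementation.

```python
def nanocores_from_samples(
    prev_cpu_ns: int, prev_time_s: float, curr_cpu_ns: int, curr_time_s: float
) -> float:
    """Rate of CPU time used, in nanoseconds of CPU per second of wall time.

    One core running flat out burns 1e9 CPU-nanoseconds per second,
    so this rate is numerically the usage in nanocores.
    """
    elapsed_s = curr_time_s - prev_time_s
    if elapsed_s <= 0:
        raise ValueError("samples must be in increasing time order")
    return (curr_cpu_ns - prev_cpu_ns) / elapsed_s


# Hypothetical samples taken 10 seconds apart:
# 1.5 seconds of CPU time were consumed in that window.
rate = nanocores_from_samples(5_000_000_000, 100.0, 6_500_000_000, 110.0)
print(rate)                  # 150000000.0 nanocores
print(rate / 1_000_000_000)  # 0.15 cores
```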

I would like to have the values of kubernetes.cpu.usage.total appear as just “cpus” for use in DataDog. Right off the bat, two solutions jump out at me:

  • scale the kubernetes.cpu.usage.total metric by a billion at the agent level, and change its unit from percent_nano to cpu - I’ve made a PR for this (link coming shortly)
  • make a new unit, nanocores, that will accept the current values being sent, but display them as cores (similar to how it seems that byte-based metrics work)
  • maybe other folks have other ideas?
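The second option (a unit that accepts nanocores but displays cores, the way byte-based metrics are displayed) could look roughly like the Python sketch below. The function name, the unit labels, and the two-decimal formatting are all hypothetical choices made for illustration; this is not how Datadog implements units.

```python
def format_nanocores(value: float) -> str:
    """Render a nanocore value with the largest unit that fits."""
    # (scale factor, label) pairs, largest first. Labels are illustrative.
    units = [
        (1_000_000_000, "cores"),
        (1_000_000, "millicores"),
        (1_000, "microcores"),
    ]
    for factor, label in units:
        if abs(value) >= factor:
            return f"{value / factor:.2f} {label}"
    return f"{value:.0f} nanocores"


print(format_nanocores(1_500_000_000))  # 1.50 cores
print(format_nanocores(250_000_000))    # 250.00 millicores
print(format_nanocores(42))             # 42 nanocores
```

With this approach the stored values stay unchanged, and only the display layer rescales them.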

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Reactions: 12
  • Comments: 39 (9 by maintainers)

Top GitHub Comments

13 reactions
balusarakesh commented, Feb 1, 2021

Is there a solution for the above problem from @msvechla? The suggestion from @ryanooi doesn’t seem to work; we are looking for a way to get a pod’s CPU utilization.

This is the correct function: ( ( a / 1000000000 ) / b ) * 100
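The comment does not say what a and b stand for. The Python sketch below assumes a is kubernetes.cpu.usage.total (in nanocores) and b is a metric expressed in cores, such as kubernetes.cpu.capacity or kubernetes.cpu.limits. The function name and the sample values are hypothetical.

```python
def cpu_percent(usage_nanocores: float, cores: float) -> float:
    """Apply ((a / 1000000000) / b) * 100 from the comment above.

    Assumption: a is usage in nanocores, b is a core count
    (capacity, limit, or request).
    """
    if cores <= 0:
        raise ValueError("cores must be positive")
    return ((usage_nanocores / 1_000_000_000) / cores) * 100


# Hypothetical pod using half a core against a 2-core limit.
print(cpu_percent(500_000_000, 2))  # 25.0
```

Dividing usage by capacity without the nanocore conversion gives a number a billion times too large, which would explain percentages that look wrong in a graph.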

5 reactions
msvechla commented, Sep 20, 2018

@JulienBalestra what would be the best way to plot the percentage of CPU used now? I tried it with kubernetes.cpu.usage.total / kubernetes.cpu.capacity, however I am not getting the correct percentage numbers in the graph:


Top Results From Across the Web

Metric-server cpu and memory units - General Discussions
Kubernetes CPU metrics are generally “expressed” when a person deals with them in millicores/milliCPU, or 1/1000 of a cpu. a nanocore/nanoCPU is ...

Kubernetes CPU in nanoseconds
If you have a running total of the number of (nano)seconds, you can look at the derivative to figure out percentages. Example:.

Kubernetes Data Collected
Note: The set of metrics collected by the Datadog Kubernetes integration ... kubernetes.cpu.usage.total (gauge), The number of cores used. Shown as nanocore.

Kubernetes fields | Metricbeat Reference [8.5]
CPU usage as a percentage of the defined limit for the pod containers (or total node CPU if one or more containers of...

Kubernetes metrics for thresholds
These are the metrics that are available for use in Kubernetes Cluster thresholds: Cluster Name; CPU: Allocatable Nanocores, Capacity Nanocores, Usage Core ...
