expose transfer bytes total (count) rather than just gauge of current value
Currently the worker Prometheus endpoint exposes transfer_incoming_bytes and transfer_outgoing_bytes as gauges, i.e., the current value at a single point in time.
A better way to expose this sort of data is as a monotonically increasing counter metric, exposed as transfer_incoming_bytes_total and transfer_outgoing_bytes_total.
It’s easy to get a rate from an accumulated count, but you can’t get an accurate count from a sampled rate.
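To illustrate the point, here is a minimal sketch in plain Python. It is a simplified model, not Dask's implementation: the per-tick transfer sizes are made-up numbers, the "gauge" is modelled as the bytes moved during the tick at which the scrape happens, and the names (transfers, scrape, SCRAPE_INTERVAL) are hypothetical. It compares what a scraper can reconstruct from a cumulative counter with what it can reconstruct from point-in-time samples.

```python
# Hypothetical bytes transferred at each tick (illustrative data, not measurements).
transfers = [0, 500, 0, 0, 2000, 0, 0, 0, 300, 0, 0, 100]

# Assumption: the scraper only sees every 4th tick.
SCRAPE_INTERVAL = 4


def cumulative_counter(values):
    """Running total at each tick, as a *_total counter would report it."""
    total = 0
    out = []
    for v in values:
        total += v
        out.append(total)
    return out


def scrape(series, interval):
    """Keep only the value visible at the end of each scrape interval."""
    return series[interval - 1 :: interval]


counter_samples = scrape(cumulative_counter(transfers), SCRAPE_INTERVAL)
gauge_samples = scrape(transfers, SCRAPE_INTERVAL)

# From the counter: bytes per interval are differences between consecutive samples.
previous = [0] + counter_samples[:-1]
bytes_per_interval = [cur - prev for cur, prev in zip(counter_samples, previous)]
rate_per_tick = [b / SCRAPE_INTERVAL for b in bytes_per_interval]

# From the gauge: the best available estimate assumes each sample held for the
# whole interval, which misses any burst that fell between scrapes.
gauge_estimate = sum(g * SCRAPE_INTERVAL for g in gauge_samples)

true_total = sum(transfers)
print("true total:           ", true_total)
print("total from counter:   ", sum(bytes_per_interval))
print("estimate from gauge:  ", gauge_estimate)
print("rate from counter:    ", rate_per_tick)
```

In this toy data the counter recovers the exact total regardless of the scrape interval, while the gauge-based estimate depends entirely on which ticks happen to be sampled.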
Issue Analytics
- State:
- Created a year ago
- Comments: 6 (4 by maintainers)
Top GitHub Comments
I don’t think we actually expose bytes transferred over time, i.e., what @crusaderky calls “cumulative” values in https://github.com/dask/distributed/pull/6936#issuecomment-1230524443
That’s what I was asking for. If it’s not high-value, feel free to ignore for now though.
For context, host metrics can tell us how much data moves in and out of each worker. What they can’t tell us (at least not easily) is how much of that is worker-to-worker transfer versus data moving into or out of the cluster (e.g., S3). I think it would be nice if Dask could tell us how many bytes of host network traffic are due to transfer.
I would also find this useful for benchmarking. Total amount of data transferred is a useful metric to compare when working on changes to scheduling.