Feature: Create a metrics view
We want to display a metrics view/page in the IPFS desktop app and webui. Ideally, this view would let users see any or all of the metrics they need without having to use the CLI or other tools. This view would not include tools or complex features for processing those metrics; for that, users would use the Diagnostics View.
Original issue description
@olizilla suggested we could add a metrics tab somewhere in the Web UI where we could show the data from `${apiAddr}/debug/metrics/prometheus` graphically. That would certainly be an interesting idea and useful for some kinds of users.

Leaving this as a WIP Issue
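As a rough illustration of the suggestion above, here is a minimal sketch of reading that endpoint from the browser. It assumes the endpoint answers a plain GET on the API address and that CORS allows the request; `metricsUrl` and `fetchMetricsText` are hypothetical helper names, not part of ipfs-webui.

```javascript
// Build the metrics URL from the API address (hypothetical helper).
function metricsUrl (apiAddr) {
  // Drop any trailing slashes so we do not produce "//debug/...".
  return `${apiAddr.replace(/\/+$/, '')}/debug/metrics/prometheus`
}

// Fetch the raw Prometheus text. `fetchImpl` is injectable so the
// function can be exercised without a running node.
// Assumption: the endpoint responds to a plain GET request.
async function fetchMetricsText (apiAddr, fetchImpl = globalThis.fetch) {
  const res = await fetchImpl(metricsUrl(apiAddr))
  if (!res.ok) {
    throw new Error(`metrics request failed with status ${res.status}`)
  }
  return res.text()
}
```

The result is the raw text body; turning it into something chartable is a separate step.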
Agreed upon metrics
These are the metrics we have decided to implement.
Metric | Where do we get it | Requires change to Kubo/libp2p/etc? | Code sample | Notes |
---|---|---|---|---|
Possible metrics
These metrics have strong arguments and use cases behind them, but still need discussion to decide whether they are useful enough to surface.
Metric | Where do we get it | Requires change to Kubo/libp2p/etc? | Code sample | Notes |
---|---|---|---|---|
downloadedSize | stats.bitswap | No | `const { dataReceived } = await getIpfs().stats.bitswap();` | Discussion at https://github.com/ipfs/ipfs-webui/pull/1942 |
sharedSize | stats.bitswap | No | `const { dataSent } = await getIpfs().stats.bitswap();` | Discussion at https://github.com/ipfs/ipfs-webui/pull/1942 |
dialable | kubo client/server mode | Yes | TBD | This metric surfaces whether a node is in server or client (serving/leeching) mode. We should be able to infer this, but it needs analysis. |
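The two bitswap rows above could be turned into display values roughly as follows. This is only a sketch: `formatBytes` and `toTransferMetrics` are hypothetical helpers, and it assumes `dataReceived` and `dataSent` arrive as either a number or a BigInt.

```javascript
// Format a byte count using binary units (hypothetical helper).
function formatBytes (value) {
  const units = ['B', 'KiB', 'MiB', 'GiB', 'TiB']
  // Number() accepts both number and BigInt input; precision is lost
  // above 2^53, which is acceptable for a rounded display value.
  let n = Number(value)
  let i = 0
  while (n >= 1024 && i < units.length - 1) {
    n /= 1024
    i++
  }
  // Whole bytes are shown as-is, larger units with one decimal place.
  return `${i === 0 ? n : n.toFixed(1)} ${units[i]}`
}

// Map the bitswap stats fields from the table to the proposed metrics.
function toTransferMetrics (bitswapStats) {
  const { dataReceived, dataSent } = bitswapStats
  return {
    downloadedSize: formatBytes(dataReceived),
    sharedSize: formatBytes(dataSent)
  }
}
```

In the webui this would be called with the result of `await getIpfs().stats.bitswap()`, as in the code samples in the table.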
Disqualified metrics
These metrics will not be included in the metrics view for one reason or another.
Metric | Where do we get it | Code sample | Notes |
---|---|---|---|
Looking for community and IPFS Implementers’ feedback on this issue.
Here are some questions to help get the ideas flowing.
Which metrics should we focus on?
- What metrics do you currently use/view/monitor and how do you obtain them?
- Would it benefit you to see those metrics graphed in the Webui/desktop app?
- Which of those metrics are difficult to get access to or discover?
- e.g. metrics that require comprehensive calculations, metrics that you always seem to forget how to obtain, metrics that require a complicated process to obtain, etc.
- What metrics do you desire but are not currently obtainable? Why do you need them (use case)?
- Local node uptime/downtime - To monitor the stability of my node, because I need my home IPFS node to always be available
- Local node activity (global reqs, responses, latency, etc…) - To monitor how much my node is used (just curious)
- Total peers connected over time - To monitor the health of my IPFS network
- Count and frequency of requests that time out - To denylist/allowlist certain good/bad peers
- etc…
What/Why/How
- What problems do you currently have that a specific metrics view in the webui and IPFS desktop could help you solve?
- Do you need customizable metrics views beyond selecting a time window?
- What kind of charts & graphs would benefit you most?
- Would you use the metrics view instead of your current tool of choice if the metrics were available in the IPFS desktop and webui?
Issue Analytics
- Created 4 years ago
- Reactions: 3
- Comments: 7 (7 by maintainers)
Top GitHub Comments
On my way to work this morning I started day-dreaming about how nice it would be to get back to working on developer tools for IPFS & co. A dangerous move as I was riding my bike at the time, but lo! What serendipitous timing! I’d love to pitch in on this!
That wall of text is the only format for getting that particular list of data points. It’s intended for consumption by Prometheus, the timeseries db, but it’s well standardised and much tooling exists for it. The numbers are all point-in-time measurements, intended to be scraped periodically, so it’s not such a bad fit for a local dev tool that is already in the habit of polling the api for info.
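Because the numbers are point-in-time measurements, a polling UI has to derive rates itself from consecutive scrapes. A minimal sketch of that step, assuming each sample is stored with the time it was scraped (`ratePerSecond` and the `{ value, timeMs }` shape are hypothetical):

```javascript
// Derive a per-second rate from two scrapes of the same counter.
// Returns null when no meaningful rate can be computed.
function ratePerSecond (previous, current) {
  const elapsedMs = current.timeMs - previous.timeMs
  if (elapsedMs <= 0) return null // samples out of order or identical
  const delta = current.value - previous.value
  if (delta < 0) return null // counter went backwards, e.g. node restarted
  return delta / (elapsedMs / 1000)
}
```

Returning `null` on a counter reset keeps a restart from showing up as a large negative spike in a chart.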
There are some useful metrics in there that we use all the time now to check if a node is working well, things like bitswap queue length. In general, though, many of the metrics are super specific and only really of use to a developer who is making specific changes to a specific subsystem. So we should build up a short list of metrics that are worth putting the spotlight on, and then give experts a way to get to the kitchen sink.
My quick take is:
`count`, but having useful visualization of `histogram` metrics makes it way easier to answer questions like:
So in practice, we can’t hardcode or assume too much:
`debug/metrics/prometheus` endpoint is available, and provide UI for exploring metrics listed there.
`TYPE` like `counter` and `histogram` and unique name and description defined in `HELP` lines, we should be able to generate UI for each (if there is no JS tooling for this format already – needs research)
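To make the `HELP`/`TYPE` idea concrete, here is a simplified sketch of grouping the Prometheus text format into metric families that a UI could iterate over. It is not a complete parser (label values containing unusual characters are not handled), `parsePrometheusText` is a hypothetical name, and existing JS tooling for this format may well be a better choice.

```javascript
// Parse Prometheus text exposition format into metric families.
// Simplified: labels are kept as a raw string, timestamps are ignored.
function parsePrometheusText (text) {
  const families = new Map()
  const family = (name) => {
    if (!families.has(name)) {
      families.set(name, { name, help: '', type: 'untyped', samples: [] })
    }
    return families.get(name)
  }
  const parseValue = (s) =>
    s === '+Inf' ? Infinity : s === '-Inf' ? -Infinity : Number(s)

  for (const raw of text.split('\n')) {
    const line = raw.trim()
    if (line === '') continue
    const meta = /^# (HELP|TYPE) (\S+) ?(.*)$/.exec(line)
    if (meta) {
      const [, kind, name, rest] = meta
      if (kind === 'HELP') family(name).help = rest
      else family(name).type = rest
      continue
    }
    if (line.startsWith('#')) continue // any other comment line
    const sample = /^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)/.exec(line)
    if (!sample) continue
    const [, sampleName, labels = '', value] = sample
    // Histogram and summary samples use _bucket/_sum/_count suffixes;
    // attach them to the family declared by the HELP/TYPE lines.
    const base = sampleName.replace(/_(bucket|sum|count)$/, '')
    const owner = families.has(sampleName)
      ? sampleName
      : (families.has(base) ? base : sampleName)
    family(owner).samples.push({ name: sampleName, labels, value: parseValue(value) })
  }
  return [...families.values()]
}
```

Each returned family carries its name, description, and type, so a UI could pick a chart per `type` (for example a single number for a gauge and a bar chart of buckets for a histogram) without hardcoding metric names.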