First pass at Web UI
Users can better diagnose issues and performance problems if we expose the cluster state to them. One important component of this is a web UI.
This UI would help users to answer questions like the following:
- At a glance, what is the state of my network? How many workers do I have and what is their state?
- For each of these workers, tell me the following information:
  - What are they working on?
  - How much memory do they have? How much is free? What kinds of data are taking up space?
  - How many cores do they have? What are they working on?
This might all fit on the same page or there might be several pages for different views.
Information Approaches
The machine hosting the scheduler will likely also host the http server for the web UI. This HTTP server can either be a separate process or it can be a Tornado application in the same process running on the same event loop.
If we separate the two servers into separate processes, then the web UI should probably query the scheduler over the JSON interface the scheduler already hosts. This separation is nice because the scheduler is resource constrained, so we are wary of any serious application competing with it for CPU.
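As a sketch of the separate-process approach: the web UI process would periodically fetch one of the scheduler's JSON routes and reduce it to the at-a-glance numbers a page would show. The URL below is illustrative (check the running scheduler's log for the real address), and `summarize` is a hypothetical helper, not part of the scheduler.

```python
import json
from urllib.request import urlopen


def fetch_info(url):
    """Fetch one of the scheduler's JSON routes from a separate process.

    The URL is whatever the scheduler logs at startup, e.g.
    http://192.168.1.141:37084/info.json
    """
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize(info):
    """Reduce the raw info.json payload to at-a-glance numbers for a UI."""
    ncores = info.get("ncores", {})
    return {
        "status": info.get("status"),
        "n_workers": len(ncores),
        "total_cores": sum(ncores.values()),
    }
```

Against the payload `{"ncores": {}, "status": "running"}` that `curl` returns from a fresh scheduler, `summarize` reports zero workers and zero cores.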
If we join the two servers into one process running on the same event loop then the web UI can directly query the Python state of the scheduler. This reduces the need to make JSON routes on the scheduler side but does force whoever makes the web UI to understand the scheduler a bit more. This also allows for immediate push notification of certain events.
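The same-process option could look roughly like the Tornado handler below, which reads scheduler state directly instead of going through a JSON route. The `scheduler` object and its `ncores`/`status` attributes are assumptions for illustration; the real scheduler's internals may differ.

```python
import tornado.web


def scheduler_info(scheduler):
    """Pull state straight off the (assumed) scheduler object."""
    return {"ncores": scheduler.ncores, "status": scheduler.status}


class InfoHandler(tornado.web.RequestHandler):
    """Serves scheduler state; runs on the scheduler's own event loop."""

    def initialize(self, scheduler):
        self.scheduler = scheduler

    def get(self):
        # Direct attribute access: no JSON route needed on the scheduler
        # side, but the handler author must understand scheduler internals.
        self.write(scheduler_info(self.scheduler))


def make_app(scheduler):
    return tornado.web.Application([
        (r"/info.json", InfoHandler, dict(scheduler=scheduler)),
    ])
```

Because the handler shares the event loop, it could also push notifications to clients the moment an event occurs, rather than waiting to be polled.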
Personally I’m happy to support either. The separate-process, JSON-based approach seems more appropriate to start with, but if we get very serious about this we may want to switch. I largely like the JSON approach because it reduces the number of things anyone needs to know and keeps everyone in their comfort zone.
Visualization Approaches
Currently we have nothing, or, in some cases, JSON. Really anything would be better than this. This might include simple static web pages that we refresh or it might include full Bokeh applications.
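The "simple static web pages that we refresh" option needs almost nothing: a page with a `meta refresh` tag that we re-render from the scheduler's JSON payload. The template and `render_status_page` helper below are hypothetical, just to show how little is involved.

```python
# A page that asks the browser to reload itself every 2 seconds.
REFRESH_TEMPLATE = """<!DOCTYPE html>
<html>
  <head>
    <meta http-equiv="refresh" content="2">
    <title>Cluster status</title>
  </head>
  <body>
    <h1>Status: {status}</h1>
    <p>{n_workers} workers, {total_cores} cores</p>
  </body>
</html>"""


def render_status_page(info):
    """Render an info.json-style payload as a self-refreshing HTML page."""
    ncores = info.get("ncores", {})
    return REFRESH_TEMPLATE.format(
        status=info.get("status", "unknown"),
        n_workers=len(ncores),
        total_cores=sum(ncores.values()),
    )
```

A full Bokeh application would replace the crude whole-page refresh with live-updating plots, at the cost of a heavier dependency and more moving parts.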
I’m ignorant here so I’m looking for other people who can take on these decisions.
Getting started
There are a few simple JSON routes listed at the bottom of these pages:
- https://github.com/dask/distributed/blob/master/distributed/http/scheduler.py
- https://github.com/dask/distributed/blob/master/distributed/http/worker.py
When you start up a scheduler locally it reports where these routes are served:
mrocklin@workstation:~$ dscheduler
distributed.scheduler - INFO - Start Scheduler at: 192.168.1.141:8786
distributed.scheduler - INFO - http at: 192.168.1.141:37084 # <----- look here
mrocklin@workstation:~$ curl 192.168.1.141:37084/info.json
{"ncores": {}, "status": "running"}
For me, ideally, web development experiments start in separate repositories that we eventually decide to merge in.
Issue Analytics
- Created 8 years ago
- Comments: 6 (5 by maintainers)
Top GitHub Comments
First pass completed. Closing.
I started playing with this over here: #161
I suspect that it will take several iterations before something presentable exists though.