Parcoords multiselection
Currently, parcoords permits the selection of one contiguous range within each dimension (via brushing, or via an API call). We're working on support for multiselection, i.e. an OR relation among multiple ranges within a specific dimension. Any of the 60+ dimensions may carry multiple selections. The goal is to retain as much speed as feasible.
Currently, a promising route for the multiselection is the use of a pixel-based bitmap (likely via textures) for each dimension. Basically, instead of arbitrary floating-point(*) ranges, filtering would have no higher resolution than what can be selected on the screen, or discerned as lines, i.e. the pixel raster. A mechanical interpretation is that brushing would turn pixels on/off, enabling or disabling lines that go through that pixel. This is sufficient for mouse-based interactions, but in theory(**), a floating-point range can be more accurate, allowing subpixel selection via the API.
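A minimal CPU-side sketch of the idea (the names `makeDimensionMask`, `brush` and `testValue` are illustrative, not existing plotly.js API): each dimension keeps one bit per pixel row, brushing toggles a contiguous run of rows, and multiple brushes on the same dimension OR together simply by landing in the same bitmap.

```js
// Hypothetical sketch, not plotly.js API: one bit per pixel row per dimension.
function makeDimensionMask(heightPx) {
  return {
    heightPx,
    bits: new Uint8Array(heightPx) // 1 = pixel row selected, 0 = filtered out
  };
}

// Brushing turns a contiguous run of pixel rows on or off; several
// brushes on the same dimension OR together in the shared bitmap.
function brush(mask, y0, y1, on) {
  for (let y = Math.max(0, y0); y <= Math.min(mask.heightPx - 1, y1); y++) {
    mask.bits[y] = on ? 1 : 0;
  }
}

// A data value normalized to [0, 1] is quantized to its pixel row, so the
// filter resolution is capped at the raster, as described above.
function testValue(mask, v01) {
  const y = Math.min(mask.heightPx - 1, Math.floor(v01 * mask.heightPx));
  return mask.bits[y] === 1;
}
```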
(*) (**) The 32-bit floating points are effectively 23-bit integers in this regard, which is quite a bit more than the (for example) 12-bit precision (= 4096 levels) that characterizes a 4k-tall parcoords, or the 8–10 bit precision of views of regular height, e.g. on a dashboard. On the other hand, parcoords is an aggregate view where going more accurate than a pixel may not be useful even when applying the constraints via the API. So this approach is workable iff the visible raster is deemed sufficient as constraint resolution via the API. (It's not possible to go to the subpixel level with the mouse, unless with browser zooming or similar.)
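As a worked example of that precision gap (numbers mine, not from the original issue):

```js
const floatSteps = 2 ** 23;     // float32 mantissa: 8388608 distinguishable steps
const raster4k = 2 ** 12;       // 4096 pixel rows on a 4k-tall panel: 12 bits
const rasterDashboard = 2 ** 8; // ~256 pixel rows on a small dashboard panel
console.log(floatSteps / raster4k); // 2048 float steps collapse into one pixel row
```

That is, an API-supplied range boundary can fall anywhere among ~2048 representable values inside a single 4k pixel row, none of which the raster-based filter could distinguish.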
The motive is that the vertical height of the parcoords is bounded by reasonable monitor resolutions, and WebGL gl.MAX_TEXTURE_SIZE is at least 4096, gl.MAX_TEXTURE_IMAGE_UNITS is at least 8, and each texture pixel can hold 4 bytes (32 bits), so even in the worst case there's plenty of room for a full-height 4k monitor display and our almost 64 dimensions (e.g. two 32-bit texels per pixel row already yield 64 mask bits). The resulting vertex shader algo is a simple texture lookup, and given that those 64 bits line up with our almost 64 dimensions, it could reduce to simple bit-fiddling operations.
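A hedged sketch of what that vertex-shader lookup might look like (GLSL ES 1.0 embedded in a JS string; the uniform name and the 2-texel-wide mask layout are assumptions, not the actual plotly.js shader; note that GLSL ES 1.0 has no bitwise operators, so the bit has to be dug out with floor/mod arithmetic):

```js
// Hypothetical mask lookup for the vertex shader. Assumes a mask texture
// 2 texels wide (2 x RGBA8 = 64 bits per pixel row, one bit per dimension)
// and as tall as the pixel raster.
const maskLookupGLSL = `
  uniform sampler2D mask;

  // dim: dimension index 0..63, v: data value normalized to [0, 1]
  float maskBit(float dim, float v) {
    float texelX = floor(dim / 32.0);            // which 32-bit texel
    float channel = floor(mod(dim, 32.0) / 8.0); // which RGBA channel
    float bit = mod(dim, 8.0);                   // which bit in that channel
    vec4 texel = texture2D(mask, vec2((texelX + 0.5) / 2.0, v)) * 255.0;
    float byteVal = channel < 1.0 ? texel.r
                  : channel < 2.0 ? texel.g
                  : channel < 3.0 ? texel.b
                  : texel.a;
    // No bit operators in GLSL ES 1.0: extract the bit arithmetically.
    return mod(floor(byteVal / pow(2.0, bit)), 2.0); // 1.0 if pixel selected
  }
`;
```

One caveat worth testing early: sampling a texture in the vertex shader is governed by gl.MAX_VERTEX_TEXTURE_IMAGE_UNITS, a separate limit that the WebGL1 spec allows to be zero on rare old hardware.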
There are alternatives, but as this approach is promising, would have fairly bounded runtime (to be tested), and imposes no limit on how many ranges can be selected, we should discuss whether it is an acceptable solution before exploring more involved alternatives.
Top GitHub Comments
@phxnsharp thanks for the added info. We've been working on the pixel mask on the GPU, where things like lookup masks (conveniently at pixel pitch, but they can be sparser or denser, up to 4k evenly spaced bins) are fairly straightforward and of O(1) complexity, but relation joins aren't.
What I mean by joins is that the requirement is analogous to joining a data relation (a table with possibly 10k or more rows per dimension, up to ~60 dimensions) with a filter relation (up to 1000 values, i.e. records, per dimension). Performing such joins requires operations (merge join, hash join, B-tree index lookup) that aren't a good fit for WebGL; such operations are sometimes done in a GPGPU setting with CUDA or OpenCL, but WebGL has no compute shaders, so we're restricted to vertex and fragment shaders. For example, in WebGL even a `for` loop where the iteration count is not known in advance is tricky, because the iteration count needs to be known at compile time, and setting too high a value for the worst case means that performance will always reflect that worst case.

We'll do some brainstorming though to see how it could be solved, given the specifics that only ordinal variables need to be filtered, and with up to 1000 values per dimension. A promising direction, sketched below, is to bijectively map the categorical values of possibly uneven cadence [x0, …, xN] to the integers [0, …, N], and bring back the "distortion" via the inverse mapping (via a new lookup table) on the GPU for rendering. The benefit is that the filter could use the integer grid, so another lookup table (bitmap) could be indexed into to check whether a value is selected. While it looks feasible, it's computationally more involved than the pixel-based approach, because it adds two more lookups over what we have now and we may need to do it for all dimensions, categorical or not. We may have to bake in the ~1000 (e.g. 1024) limit for categorical values, which doesn't sound like a big loss. We'll check whether there are enough GPU resources for this approach to run on most or all computers, to avoid relying on resource limits such as the allowed texture count, which aren't well supported by either the standard or the overwhelming majority of hardware makers.
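A CPU-side sketch of that direction (again with illustrative names, not plotly.js API; the real thing would live in typed arrays and textures on the GPU): the categories are mapped bijectively to the integer grid, and a per-dimension bitmap over that grid answers the membership test.

```js
// Hypothetical sketch of the categorical mapping idea.
function makeCategoricalFilter(categories /* possibly unevenly spaced values */) {
  // Bijective map: category value -> integer index 0..N
  const toIndex = new Map(categories.map((c, i) => [c, i]));
  // Selection bitmap on the integer grid, one bit per category,
  // with the assumed ~1024-category cap baked in.
  const selected = new Uint32Array(Math.ceil(categories.length / 32));
  return {
    // Inverse lookup table: would be uploaded as a texture for rendering.
    fromIndex: categories.slice(),
    select(value, on) {
      const i = toIndex.get(value);
      if (i === undefined) return;
      if (on) selected[i >> 5] |= 1 << (i & 31);
      else selected[i >> 5] &= ~(1 << (i & 31));
    },
    isSelected(value) {
      const i = toIndex.get(value);
      return i !== undefined && (selected[i >> 5] & (1 << (i & 31))) !== 0;
    }
  };
}

// Usage sketch:
const f = makeCategoricalFilter(['low', 'medium', 'high', 'extreme']);
f.select('medium', true);
f.select('extreme', true);
console.log(f.isSelected('medium'), f.isSelected('high')); // true false
```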
Keeping both modes is possible: we'd preserve what we have now and add the raster-based filtering in an AND relationship. We might even be able to avoid a texture lookup for those dimensions that don't need it, although the GPU is often faster if we just let it do superfluous parallel work when that helps avoid conditional branching. If the bitmap raster is fast enough on its own, then it'll be fast enough when intersected with what we have now. All this (multidimensional) point selection logic happens in the vertex shader, and our bottleneck is the fragment shader anyway.
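For completeness, the combined predicate could look like this sketch (reusing the hypothetical `testValue` from the earlier sketch; the shader version would express the same AND per vertex):

```js
// Hypothetical combined filter: the existing single contiguous range test
// AND the new raster mask, for every dimension of a line.
function lineVisible(dims /* [{ value01, range: [lo, hi], mask }] */) {
  // AND across dimensions; within a dimension the mask already encodes
  // the OR across multiple brushed ranges.
  return dims.every(({ value01, range, mask }) =>
    value01 >= range[0] && value01 <= range[1] && // current single-range filter
    testValue(mask, value01)                      // raster mask (sketched earlier)
  );
}
```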