Supporting rasterize options via a compositor

A long-standing issue is that options used with regular plotting don’t transfer to the output of the datashader operations. Keeping operations independent of the option machinery has been an important design principle, especially since style options are backend dependent.

However, with improvements in Bokeh (and to some extent matplotlib), we hope to be able to avoid the use of datashade and use rasterize instead. In particular, we now have eq_hist support on the plotting/client side, and soon Bokeh will support categorical color mapping as well. Datashader now supports line thicknesses for Curve/Path elements, helping close the gap further. This improved rasterize approach works better with HoloViews than datashade, so this is a good time to revisit the problem.

We can keep the basic datashader operations independent of the options system (leaving them as they are now) by making use of HoloViews compositors (a powerful system that we use for statistical elements but haven’t really discussed or documented much). Compositors allow us to map option system settings to operation parameters during the plotting process, substituting the element the user passes to the display machinery with the output of the compositor operation. To make this work, we would add new datashader operations that are purely about the display process and should not be used directly by users.

This means we could support rasterize as follows:

hv.Curve([1,2,3]).opts(rasterize=True)

The presence of rasterize=True would enable the compositor operation, which would grab all the necessary style/plot options, run the regular rasterize operation with those options and display the result.
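To make the substitution step concrete, here is a minimal sketch in plain Python. It does not use the HoloViews compositor API: Element, rasterize_op and render are hypothetical stand-ins for an element, the regular rasterize operation and the display machinery, and the way options are stored is an assumption for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical stand-in for an element carrying data and options."""
    data: list
    opts: dict = field(default_factory=dict)


def rasterize_op(element, **params):
    """Hypothetical stand-in for the regular rasterize operation."""
    return {"kind": "Image", "source": element.data, "params": params}


def render(element):
    """Swap the element for rasterized output at display time if requested."""
    if not element.opts.get("rasterize"):
        # No rasterize option: the element is displayed as it is.
        return {"kind": "Curve", "source": element.data, "params": {}}
    # Collect the remaining style/plot options and hand them to the operation.
    params = {k: v for k, v in element.opts.items() if k != "rasterize"}
    return rasterize_op(element, **params)


shown = render(Element([1, 2, 3], {"rasterize": True, "color": "red", "line_width": 4}))
```

The point of the sketch is that the user only ever builds and styles the original element; the rasterized replacement exists only inside the display step.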

We could map certain options directly. Off the top of my head, common ones are color, line_width and cmap. So you could use:

hv.Curve([1,2,3]).opts(rasterize=True, color='red', line_width=4)

So far, the only new option is rasterize, and while I suspect rasterize might be a matplotlib option somewhere, I don’t think it is one that is important to expose anywhere (something to do with SVG rendering?).

There are always going to be options that only make sense at the datashader level (e.g. aggregator, x_sampling/y_sampling). For these, I propose that rasterize can also be a dictionary with options aimed at datashader:

hv.Curve([1,2,3]).opts(rasterize=dict(aggregator=ds.mean('dim')), color='red', line_width=4)

This dictionary would be an override, so it would take precedence over any options that would normally be automatically mapped to datashader. The hope is that the dictionary is only required when more control is needed and that the usual style options are sufficient for most common cases.
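The precedence rule can be sketched as a small merge function. This is plain Python rather than HoloViews code; datashader_params and the AUTO_MAP table are hypothetical, and which options can actually be mapped is one of the open questions below.

```python
# Hypothetical automatic mapping: style option name -> operation parameter name.
AUTO_MAP = {"color": "color", "line_width": "line_width", "cmap": "cmap"}


def datashader_params(opts):
    """Build the parameters for a rasterize call from an element's options.

    Returns None when rasterization is not requested.
    """
    rasterize = opts.get("rasterize", False)
    if not rasterize:
        return None
    # Start from the options that are mapped automatically.
    params = {AUTO_MAP[k]: v for k, v in opts.items() if k in AUTO_MAP}
    # An explicit dictionary overrides anything mapped automatically.
    if isinstance(rasterize, dict):
        params.update(rasterize)
    return params


merged = datashader_params({
    "rasterize": {"aggregator": "mean", "line_width": 2},
    "color": "red",
    "line_width": 4,
})
```

Here line_width comes from the dictionary (2) rather than the style option (4), while color is still picked up from the style options.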

We can consider adding another option to this dictionary, sample_limit, which would state the sample limit (e.g. rows if tabular) at which datashader rasterization is enabled. I’m not convinced we should implement more than this at the HoloViews level (and even if we do expose this more, the default should always be to have datashading off!), but hvplot could expose an easy API to specify the level/heuristic used to automatically enable/disable datashader output.
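A minimal sketch of how such a threshold might be evaluated, assuming sample_limit lives in the rasterize dictionary as suggested above. The function name should_rasterize is hypothetical, and treating the limit as inclusive is an assumption, not something the proposal specifies.

```python
def should_rasterize(n_samples, rasterize=False):
    """Decide whether to rasterize, given the element's sample count.

    rasterize may be False (the default, datashading off), True, or a
    dictionary that can carry a hypothetical 'sample_limit' entry.
    """
    if isinstance(rasterize, dict):
        limit = rasterize.get("sample_limit")
        if limit is None:
            # A dictionary without a limit simply enables rasterization.
            return True
        # Assumption: rasterization starts once the limit is reached.
        return n_samples >= limit
    return bool(rasterize)
```

With this sketch, should_rasterize(10, {"sample_limit": 100}) leaves a small element as a regular plot, while the same options applied to an element with 100 or more samples would switch to rasterized output.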

Open questions

  • The appropriate default aggregator probably needs to switch based on element type (e.g. NdOverlays to categorical?), among other things.
  • What is the full set of options that can be mapped? Are they semantically compatible with the normal options?
  • How would hover work? Typically, you would get counts and you wouldn’t get access to all the available dimensions (but recent work means you might get some dimension values out)

Issue Analytics

  • State: open
  • Created 10 months ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
jlstevens commented, Nov 14, 2022

Talking with @maximlt I remembered that we would need to use spread/dynspread for point sizes, which I don’t particularly like. It is fine if it works well enough though!

0 reactions
jbednar commented, Nov 14, 2022

I think a dynamic limit is what people would normally want, if they are coming from a perspective of really wanting a normal Bokeh plot but falling back to Datashader when things get too big, which I think is the usual motivation for rasterize=50000. There may also be people who want to select datashader exclusively for a dataset if it’s over a certain size, but I’d rather not even have to document that option because it’s confusing and hard for people to grasp the difference, so I wouldn’t even implement it unless we had a really clear use case and motivation. So my vote is for dynamic only.
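One way to read "dynamic" here is that the decision is re-evaluated for the current viewport rather than made once for the whole dataset. A minimal sketch under that assumption, in plain Python: visible_count and use_datashader are hypothetical names, and only the x-range is considered to keep the example short.

```python
def visible_count(xs, x_range):
    """Count the samples whose x value falls inside the current viewport."""
    lo, hi = x_range
    return sum(1 for x in xs if lo <= x <= hi)


def use_datashader(xs, x_range, limit=50000):
    """Fall back to Datashader only while the viewport holds more than limit samples."""
    return visible_count(xs, x_range) > limit


xs = list(range(100))
zoomed_out = use_datashader(xs, (0, 99), limit=50)  # 100 visible samples
zoomed_in = use_datashader(xs, (0, 9), limit=50)    # 10 visible samples
```

Zooming in far enough drops the visible count below the limit, at which point a regular Bokeh plot could be shown again. A static limit would instead compare len(xs) to the limit once.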
