Trimesh aggregation approach
This issue proposes a methodology for how the “aggregation” feature could work, in the context of triangle mesh rasterization (as of PR #525, the agg keyword argument to Canvas.trimesh()).
Background
While rasterizing a mesh, we interpolate the value of each pixel based on its position within a triangle, along with the “weight” (e.g. elevation) at each vertex. In this way, each pixel can be assigned a “sample” that approximates what the measurement would likely have been between three measured (vertex) locations. When the user zooms out of an image, some triangles may become too small to see - smaller than a pixel, for example. Although we can no longer see them, their values should still somehow contribute to the pixel coloring. This is where the concept of “aggregation” comes in: when downsampling the image, we must decide how the “unseen measurements” get incorporated into each pixel’s final value, which ultimately determines the color during shading.
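To make the interpolation step concrete, here is a minimal sketch of barycentric interpolation for a single pixel, assuming 2D vertex positions and one scalar weight per vertex. The helper name and signature are illustrative only, not datashader’s internal API:

```python
import numpy as np

def interpolate_pixel(px, py, verts, weights):
    """Interpolate a scalar value at pixel center (px, py) inside a triangle.

    verts   : (3, 2) array of triangle vertex coordinates
    weights : (3,) array of per-vertex values (e.g. elevation)
    Returns the interpolated sample, or NaN if the pixel center falls
    outside the triangle.
    """
    (x0, y0), (x1, y1), (x2, y2) = verts
    # Barycentric coordinates of (px, py) with respect to the triangle
    denom = (y1 - y2) * (x0 - x2) + (x2 - x1) * (y0 - y2)
    l0 = ((y1 - y2) * (px - x2) + (x2 - x1) * (py - y2)) / denom
    l1 = ((y2 - y0) * (px - x2) + (x0 - x2) * (py - y2)) / denom
    l2 = 1.0 - l0 - l1
    if min(l0, l1, l2) < 0:  # pixel center lies outside the triangle
        return np.nan
    # Weighted combination of the three vertex values
    return l0 * weights[0] + l1 * weights[1] + l2 * weights[2]
```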
Aggregation methodology
Conceptually, datashader treats a triangle mesh as a continuous function: given a location - and perhaps sometime soon, a moment in time - as input, the function yields a single-valued output. For example, in the context of a digital elevation model, the function’s output represents the elevation at a particular location. “Aggregation” is analogous to taking a sliding calculation over this function; a “sum” aggregation is akin to taking the integral. However, when it comes to which values we should aggregate, there is a decision to make. Below I’ve attempted to illustrate this in a way that is easy to visualize:
Here every blue dot represents a measurement sample, and each interval between two ticks represents a “bin” (pixel). The blue dots are connected by lines, representing the interpolation between measurements. If we are to decide what a “minimum” aggregation should look like, there are two obvious choices: either we select among only the measured values in a bin, or we select among both the measured values and the interpolated values within the bin:
“min” aggregation, using only the measured values:
In the above approach, we’re technically “more accurate” because the aggregations are based on the raw data; however, there is an ambiguity when it comes to bin 2, where no data is available: do we introduce a “hole” in the mesh, or do we extend the previous value?
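As a rough 1D sketch of this first approach, the snippet below takes the minimum over only the raw samples that fall in each unit-width bin, leaving empty bins as NaN (the “hole” case). The helper min_measured_only is hypothetical, purely for illustration:

```python
import numpy as np

def min_measured_only(xs, ys, n_bins):
    """Per-bin minimum over raw samples only; empty bins stay NaN."""
    out = np.full(n_bins, np.nan)
    bins = xs.astype(int)  # unit-width bins: bin i covers [i, i+1)
    for b, y in zip(bins, ys):
        if 0 <= b < n_bins:
            out[b] = y if np.isnan(out[b]) else min(out[b], y)
    return out

xs = np.array([0.3, 0.7, 1.4, 3.2, 3.8])  # sample locations
ys = np.array([2.0, 1.0, 3.0, 0.5, 2.5])  # measured values
print(min_measured_only(xs, ys, 4))       # [1.  3.  nan 0.5] - bin 2 is a "hole"
```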
“min” aggregation, using both measured and interpolated values:
In the second approach, we’re coloring each pixel based on a linearly-interpolated value, potentially overriding the existing measurements in that bin. However, for a sufficiently dense mesh, interpolated results will be very close to the actual measurements, since neighboring measurements are factored into the calculation. Also, there is no ambiguity when a bin contains no measured data. For these reasons, this second approach is the one I intend to implement - barring any unforeseen complications, or more logical approaches, of course.
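In the same hypothetical 1D setting, the second approach can be sketched by evaluating the piecewise-linear interpolant densely and taking the per-bin minimum of the resulting curve. The supersampling factor and the edge-hold extrapolation are arbitrary illustrative choices, not part of the proposal:

```python
import numpy as np

def min_interpolated(xs, ys, n_bins, supersample=32):
    """Per-bin minimum of the piecewise-linear interpolant through (xs, ys)."""
    grid = np.linspace(0, n_bins, n_bins * supersample, endpoint=False)
    # np.interp holds the edge values outside [xs[0], xs[-1]] - one possible
    # extrapolation policy; a real implementation might choose differently.
    curve = np.interp(grid, xs, ys)
    return curve.reshape(n_bins, supersample).min(axis=1)

xs = np.array([0.3, 0.7, 1.4, 3.2, 3.8])
ys = np.array([2.0, 1.0, 3.0, 0.5, 2.5])
print(min_interpolated(xs, ys, 4))  # every bin gets a value; no "hole" in bin 2
```

Note that because the interpolant is piecewise linear, its exact minimum within a bin must occur either at a sample point inside the bin or at a bin edge, so an exact implementation could evaluate only those points rather than supersampling.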
Top GitHub Comments
I think this captures the results of our discussion, thanks!
Understand, thank you.