add _real_ stacked area charts [feature request]
I’m filing this issue as a gathering point for the feature request of real stacked area charts.
The current solution to create stacked area charts is to plot cumulative variables, which has multiple drawbacks:
- Correct (i.e. non-cumulative) hover values have to be set manually by using `text: '...'`.
- The resulting chart is incorrect if some traces are shown and others hidden (by clicking on legend items).

A real solution would be to have an argument like `layout = {linemode: 'stack'}`, same as there is for bar charts.
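To make the workaround concrete, here is a minimal sketch of how the cumulative traces might be built by hand. The helper name `toCumulativeTraces` and the sample data are made up for illustration, and the sketch assumes every trace shares the same `x` values; `text`, `hoverinfo` and `fill` are ordinary scatter trace attributes.

```javascript
// Sketch of the cumulative-sum workaround (hypothetical helper, not part of plotly.js).
// Assumption: all traces share identical x values, in the same order.
function toCumulativeTraces(traces) {
  const running = new Array(traces[0].y.length).fill(0);
  return traces.map((trace, i) => {
    // Add this trace's values on top of everything stacked so far.
    const y = trace.y.map((value, j) => {
      running[j] += value;
      return running[j];
    });
    return {
      name: trace.name,
      x: trace.x,
      y: y,                       // cumulative values, used for drawing
      text: trace.y.map(String),  // original values, set manually for hover
      hoverinfo: 'x+text+name',
      mode: 'lines',
      fill: i === 0 ? 'tozeroy' : 'tonexty'
    };
  });
}

const raw = [
  { name: 'A', x: [1, 2, 3], y: [1, 2, 3] },
  { name: 'B', x: [1, 2, 3], y: [2, 2, 2] },
  { name: 'C', x: [1, 2, 3], y: [1, 1, 1] }
];
const stacked = toCumulativeTraces(raw);
// With plotly.js loaded and a <div id="chart"> on the page:
// Plotly.newPlot('chart', stacked);
```

Both drawbacks are visible here: the real values only survive in `text`, and because C’s `y` already includes B’s contribution, hiding B via the legend does not lower C.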
I’m aware that @etpinard stated some time ago:
> We are planning on adding area charts to our list of trace types at some point this year. You are not the first person to ask for this feature.
Therefore I hope opening this issue won’t be seen as annoying harassment! Since this is an often-requested feature, I think it is important to have a place for users like me to gather all relevant information and potential progress on this.
Some more information:
- relevant bug reports:
- Generating a stacked area chart in `ggplot2` and then converting it to plotly using `ggplotly` doesn’t work as expected either (the plot doesn’t get rescaled correctly if traces in between are unselected; there will just be space left).
- proposed workaround by @etpinard (in JS): https://codepen.io/etpinard/pen/yOgdOb
Issue Analytics
- State:
- Created 7 years ago
- Reactions: 5
- Comments: 31 (29 by maintainers)
Top GitHub Comments
Closed by #2960
I don’t think there’s actually a difference between the two: Excel isn’t drawing lines at all, only fills, so there’s nothing to omit, but presumably you can turn lines on and then I bet they would be drawn the same way as Google’s.
As @nicolaskruchten alludes to, the question of what to do with mismatched `x` values is the key sticking point here, the reason adding stacked area charts is not as easy for us as adding stacked bars.

Google Sheets (and Excel, at least by default, I haven’t looked in detail) has a simpler data model than we do: every series shares the same `x` data, so it’s not possible to have mismatched `x`; the most you can do is have empty `y` at certain `x` values. They seem to treat those empties as zeros. That’s certainly a plausible interpretation for certain data anyway, but not all, and it differs from how we handle scatter (line) in other contexts, where a missing `y` (or `x` for that matter) either leaves a gap or gets the line drawn straight from one valid point to the next, depending on the `connectgaps`
setting.

Seems to me when we stack area charts we can internally fill in missing `x` values across all the stacked traces, and then there are perhaps three ways you might want to interpret gaps:
- Treat them as zeros, which is what Google Sheets and Excel seem to do.
- Interpolate across them. This corresponds to `connectgaps: true`, and would make sense in cases of incomplete data: for example, you’re summing populations across different countries by year, you don’t have data for every country for every year, but interpolating is a good assumption (certainly better than guessing zero population in those years!). This could get tricky if we want to take `line.shape` into account and try to make the (first) stacked trace look identical to its unstacked alternative. That’s probably not necessary though; at least to begin with we can interpolate linearly (which is also probably about the only interpolation option that preserves the total, independent of stacking order). But I think it probably is important to not display markers at the interpolated points.
- Leave them as gaps. This corresponds to `connectgaps: false`. We could do complicated things with stacking gap-less traces on top of gapped traces, but since the point here is to not make any assumptions that aren’t explicit in the data, probably we’d want all gaps to propagate upward to the top of the stack.

So if the second and third cases are covered by `connectgaps`, what about the first (which, to fit with Google & Excel, should be the default)? I suppose it could be a new `connectgaps: 'zero'` or something? There would also be an argument for making this a separate setting, so that `x` values with an invalid `y` (`''`, `null`, non-numeric) would be treated differently from `x` values that get inserted just because they’re present in other data sets. Perhaps you’d like newly-inserted `x` values to get `y=0` but invalid `y` to be treated as a gap?

I guess I can imagine cases where that would be the “most correct” way to display the data, though it might be more complexity than users really want. On the other hand, making a new setting for this would allow us to avoid turning `connectgaps` into another “boolean plus a string” enumerated attribute, as well as avoiding extra logic around its default value. And mostly people would just use the default value of this new setting. So what could this new attribute be? How about `stackgaps: ('zero' (dflt)|'gap'|'interpolate')`?
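To illustrate how the three proposed modes would differ, here is a rough sketch in plain JavaScript. It is not plotly.js code: `fillTrace`, `stackTraces` and the sample data are hypothetical, and `stackgaps` is only the attribute name floated above. The sketch assumes numeric `x` values, treats an invalid `y` the same as a missing `x`, interpolates linearly, and (as a further assumption) leaves a gap where there is no valid neighbour on both sides.

```javascript
// Hypothetical sketch of the proposed stackgaps modes: 'zero' | 'gap' | 'interpolate'.
// Fill one trace onto the shared x grid according to the chosen mode.
function fillTrace(trace, allX, stackgaps) {
  const known = new Map();
  trace.x.forEach((x, i) => {
    const y = trace.y[i];
    if (typeof y === 'number' && Number.isFinite(y)) known.set(x, y);
  });
  const xs = [...known.keys()].sort((a, b) => a - b);
  return allX.map((x) => {
    if (known.has(x)) return known.get(x);
    if (stackgaps === 'zero') return 0;
    if (stackgaps === 'interpolate') {
      // Find the nearest valid neighbours on each side of x.
      let lo = null;
      let hi = null;
      for (const k of xs) {
        if (k < x) lo = k;
        else if (k > x && hi === null) hi = k;
      }
      if (lo === null || hi === null) return null; // assumption: no extrapolation
      const t = (x - lo) / (hi - lo);
      return known.get(lo) + t * (known.get(hi) - known.get(lo));
    }
    return null; // 'gap'
  });
}

// Stack traces bottom-up over the union of all x values.
function stackTraces(traces, stackgaps = 'zero') {
  const allX = [...new Set(traces.flatMap((t) => t.x))].sort((a, b) => a - b);
  const running = allX.map(() => 0);
  return traces.map((trace) => {
    const filled = fillTrace(trace, allX, stackgaps);
    const y = filled.map((v, i) => {
      // A gap at any level propagates upward to the top of the stack.
      if (v === null || running[i] === null) {
        running[i] = null;
        return null;
      }
      running[i] += v;
      return running[i];
    });
    return { name: trace.name, x: allX, y };
  });
}

const demo = [
  { name: 'A', x: [1, 2, 3], y: [1, 1, 1] },
  { name: 'B', x: [1, 3], y: [2, 4] } // no point at x = 2
];
const topZero = stackTraces(demo, 'zero')[1].y;          // [3, 1, 5]
const topInterp = stackTraces(demo, 'interpolate')[1].y; // [3, 4, 5]
const topGap = stackTraces(demo, 'gap')[1].y;            // [3, null, 5]
```

Keeping the filled-in values separate from the original points would also make it straightforward to skip markers at interpolated positions, as suggested above.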