add _real_ stacked area charts [feature request]

I’m filing this issue as a gathering point for the feature request of real stacked area charts.

The current way to create stacked area charts is to plot cumulative variables, which has several drawbacks:

- Correct (i.e. non-cumulative) hover values have to be set manually using text: '...'.
- The resulting chart is incorrect if some traces are shown and others are hidden (by clicking on legend items).
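
For illustration, the cumulative workaround described above might look roughly like the sketch below. It assumes every series shares the same x values. `toCumulativeTraces` is a hypothetical helper name, not part of plotly.js; `fill`, `text` and `hoverinfo` are existing scatter trace attributes.

```javascript
// Hypothetical helper (not a plotly.js API): turns plain series into
// cumulative scatter traces, keeping the original values as hover text.
// Assumption: all series have the same x values in the same order.
function toCumulativeTraces(series) {
  const running = new Array(series[0].y.length).fill(0);
  return series.map((s, i) => {
    // Add this series on top of everything stacked so far.
    const stackedY = s.y.map((v, j) => (running[j] += v));
    return {
      type: 'scatter',
      mode: 'lines',
      name: s.name,
      x: s.x,
      y: stackedY,                        // cumulative values that get plotted
      text: s.y.map((v) => String(v)),    // non-cumulative values for hover
      hoverinfo: 'x+text+name',
      fill: i === 0 ? 'tozeroy' : 'tonexty',
    };
  });
}

const cumulativeDemo = toCumulativeTraces([
  { name: 'A', x: [1, 2, 3], y: [1, 2, 3] },
  { name: 'B', x: [1, 2, 3], y: [2, 2, 2] },
]);
console.log(cumulativeDemo[1].y);    // [ 3, 4, 5 ]
console.log(cumulativeDemo[1].text); // [ '2', '2', '2' ]
// In a page with plotly.js loaded: Plotly.newPlot('chart', cumulativeDemo);
```

Because the cumulative values are baked into each trace, hiding trace A via the legend leaves trace B drawn at its stacked height, which is the second drawback listed above.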

A real solution would be an argument like layout = {linemode: 'stack'}, analogous to barmode: 'stack' for bar charts.

I’m aware that @etpinard stated some time ago:

We are planning on adding area charts to our list of trace types at some point this year. You are not the first person to ask for this feature.

So I hope opening this issue won’t be seen as pestering! Since this is a frequently requested feature, I think it is important to have a place where users like me can gather all relevant information and track potential progress.

Some more information:

- relevant bug reports:
  - plotly.js: ~~https://github.com/plotly/plotly.js/issues/344~~
  - plotly for R: ~~https://github.com/ropensci/plotly/issues/686~~ and ~~https://github.com/ropensci/plotly/issues/810~~
- Generating a stacked area chart in ggplot2 and then converting it to plotly using ggplotly doesn’t work as expected either (the plot isn’t rescaled correctly when traces in between are deselected; empty space is left instead).
- proposed workaround by @etpinard (in JS): https://codepen.io/etpinard/pen/yOgdOb

Issue Analytics

State:
Created 7 years ago
Reactions: 5
Comments: 31 (29 by maintainers)

Top GitHub Comments

3 reactions

alexcjohnson commented, Sep 7, 2018

Closed by #2960

2 reactions

alexcjohnson commented, May 31, 2018

I don’t think there’s actually a difference between the two. Excel isn’t drawing lines at all, only fills, so there’s nothing to omit; presumably you can turn lines on, and then I bet they would be drawn the same way as Google’s.

As @nicolaskruchten alludes to, the question of what to do with mismatched x values is the key sticking point here, the reason adding stacked area charts is not as easy for us as adding stacked bars.

Google Sheets (and Excel, at least by default; I haven’t looked in detail) has a simpler data model than we do: every series shares the same x data, so it’s not possible to have mismatched x; the most you can do is have empty y at certain x values. They seem to treat those empties as zeros. That’s certainly a plausible interpretation for some data, but not all, and it differs from how we handle scatter (line) traces in other contexts, where a missing y (or x, for that matter) either leaves a gap or gets the line drawn straight from one valid point to the next, depending on the connectgaps setting.

Seems to me when we stack area charts we can internally fill in missing x values across all the stacked traces, and then there are perhaps three ways you might want to interpret gaps:

- Treat them as 0 (like Google Sheets does). This would arise, for example, if your data came from doing an SQL aggregation (sum of sales by month, say) and there were no events to aggregate in some of the periods.
- Interpolate (linearly?) across the gap. This is basically the equivalent of connectgaps: true, and would make sense in cases of incomplete data: for example, you’re summing populations across different countries by year and you don’t have data for every country for every year, but interpolating is a good assumption (certainly better than guessing zero population in those years!). This could get tricky if we want to take line.shape into account and try to make the (first) stacked trace look identical to its unstacked alternative. That’s probably not necessary though; at least to begin with, we can interpolate linearly (which is also probably about the only interpolation option that preserves the total, independent of stacking order). But I think it probably is important not to display markers at the interpolated points.
- Leave a gap. If you really want the plot to show what’s truly known, with no extra interpretation, then gaps should be left empty. This is equivalent to connectgaps: false. We could do complicated things with stacking gap-less traces on top of gapped traces, but since the point here is to not make any assumptions that aren’t explicit in the data, we’d probably want all gaps to propagate upward to the top of the stack.
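
The three interpretations above can be sketched in plain JavaScript. This is only an illustration, not how plotly.js implements stacking: `alignToX` and `stackWithGaps` are hypothetical helpers, x values are assumed to be numeric and sorted ascending within each trace, and only x values missing from a trace are considered (invalid y values are not handled). Interpolation is linear and does not extrapolate past the ends of a trace.

```javascript
// Hypothetical helper: returns the trace's y values on the shared x grid,
// filling x values the trace lacks according to `mode`
// ('zero' | 'interpolate' | 'gap').
function alignToX(allX, trace, mode) {
  const known = new Map(trace.x.map((x, i) => [x, trace.y[i]]));
  return allX.map((x) => {
    if (known.has(x)) return known.get(x);
    if (mode === 'zero') return 0;
    if (mode === 'gap') return null;
    // 'interpolate': find the nearest known neighbours on each side.
    let lo = -1;
    let hi = -1;
    for (let i = 0; i < trace.x.length; i++) {
      if (trace.x[i] < x) lo = i;
      if (trace.x[i] > x && hi === -1) hi = i;
    }
    if (lo === -1 || hi === -1) return null; // no extrapolation at the ends
    const t = (x - trace.x[lo]) / (trace.x[hi] - trace.x[lo]);
    return trace.y[lo] + t * (trace.y[hi] - trace.y[lo]);
  });
}

// Hypothetical helper: stacks traces over the union of their x values.
// A gap (null) in any trace propagates to every trace stacked above it.
function stackWithGaps(traces, mode) {
  const allX = [...new Set(traces.flatMap((t) => t.x))].sort((a, b) => a - b);
  const running = allX.map(() => 0);
  return traces.map((t) => {
    const y = alignToX(allX, t, mode).map((v, i) => {
      if (v === null || running[i] === null) {
        running[i] = null;
        return null;
      }
      running[i] += v;
      return running[i];
    });
    return { name: t.name, x: allX, y };
  });
}

const gapDemo = [
  { name: 'a', x: [1, 2, 3], y: [1, 2, 3] },
  { name: 'b', x: [1, 3], y: [10, 30] }, // no value at x = 2
];
console.log(stackWithGaps(gapDemo, 'zero')[1].y);        // [ 11, 2, 33 ]
console.log(stackWithGaps(gapDemo, 'interpolate')[1].y); // [ 11, 22, 33 ]
console.log(stackWithGaps(gapDemo, 'gap')[1].y);         // [ 11, null, 33 ]
```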

So if the second and third cases are covered by connectgaps, what about the first (which, to fit with Google & Excel, should be the default)? I suppose it could be a new connectgaps: 'zero' or something? There would also be an argument for making this a separate setting, so that x values with an invalid y ('', null, non-numeric) would be treated differently from x values that get inserted just because they’re present in other data sets. Perhaps you’d like newly-inserted x values to get y=0 but invalid y to be treated as a gap?

I guess I can imagine cases where that would be the “most correct” way to display the data, though it might be more complexity than users really want. On the other hand, making a new setting for this would allow us to avoid turning connectgaps into another “boolean plus a string” enumerated attribute, as well as avoiding extra logic around its default value. And mostly people would just use the default value of this new setting. So what could this new attribute be? How about stackgaps: ('zero' (dflt)|'gap'|'interpolate')?
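
As a rough sketch of how such a setting could behave as a plain enumerated attribute with a default: the attribute name and values are taken from the proposal above, while `coerceStackgaps` is a hypothetical helper and does not reflect how plotly.js actually validates attributes.

```javascript
// Values from the proposal above; 'zero' is the proposed default.
const STACKGAPS_VALUES = ['zero', 'gap', 'interpolate'];

// Hypothetical helper: fall back to the default when the attribute is
// missing or not one of the enumerated strings.
function coerceStackgaps(trace) {
  return STACKGAPS_VALUES.includes(trace.stackgaps) ? trace.stackgaps : 'zero';
}

console.log(coerceStackgaps({ x: [1, 2], y: [3, 4] }));        // zero
console.log(coerceStackgaps({ stackgaps: 'interpolate' }));    // interpolate
console.log(coerceStackgaps({ stackgaps: true }));             // zero
```

Keeping the attribute a pure string enumeration means connectgaps can stay a boolean, which is the design point made above.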