boxenplot area scale calculation

The area method for calculating the width of boxenplot letter-value boxes is:

'area': lambda h, i, k: (1 - 2**(-k + i - 2)) / h}

in https://github.com/mwaskom/seaborn/blob/master/seaborn/categorical.py#L1890

IIUC, in order for the area to be proportional to the percentage of data covered, as documented (https://github.com/mwaskom/seaborn/blob/master/seaborn/categorical.py#L2672), the formula should rather be:

'area': lambda h, i, k: (1 - 2**(-k + i - 1)) / h}
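
To make the difference between the two formulas concrete, here is a minimal sketch that evaluates both and prints the box area (width times height) each one implies. It assumes h is the box height, i the box index, and k the number of boxes; the issue does not define these, and the names area_current, area_proposed, and box_areas are made up for illustration and are not part of seaborn.

```python
# Width functions quoted in the issue. Reading h, i, k as box height,
# box index, and number of boxes is an assumption, not stated in the issue.
def area_current(h, i, k):
    return (1 - 2 ** (-k + i - 2)) / h


def area_proposed(h, i, k):
    return (1 - 2 ** (-k + i - 1)) / h


def box_areas(width_func, k, h=1.0):
    """Return width * height for each box index 0..k-1 (hypothetical helper)."""
    return [width_func(h, i, k) * h for i in range(k)]


k = 4
current = box_areas(area_current, k)
proposed = box_areas(area_proposed, k)

# Dividing by h and multiplying back cancels, so the area depends only on i and k.
for i, (a, b) in enumerate(zip(current, proposed)):
    print(f"i={i}: current area={a:.6f}, proposed area={b:.6f}")
```

With k = 4, the current formula gives areas 1 - 2**-6 through 1 - 2**-3, while the proposed one gives 1 - 2**-5 through 1 - 2**-2, so the two differ by one step in the exponent for every box.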

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 25 (14 by maintainers)

Top GitHub Comments

1 reaction
mwaskom commented, Aug 30, 2020

Is there a compelling reason to use area? Could we just deprecate it?

0 reactions
pierreDELANGEN commented, Nov 9, 2022

Hello, I’m reviving this issue: when comparing the “area” scaling to the original paper’s representation (Figure 3C, https://doi.org/10.1080/10618600.2017.1305277), the Seaborn implementation still seems to be incorrect.

[Screenshot: Seaborn output for random uniform data]
[Screenshot: expected result from the paper]

For me, the “area” scaling is the “correct” way of drawing boxenplots, as it directly represents the underlying PDF of the studied variable while still allowing easy reading of quantiles and of differences between multiple categories. It is a bit of a hybrid between a histogram and a boxplot.


Top Results From Across the Web

boxenplot area scale calculation - Bountysource
The area method for calculating the width of boxenplot letter-value boxes is: 'area': lambda h, i, k: (1 - 2**(-k + i -...

seaborn.boxenplot — seaborn 0.12.1 documentation - PyData
scale {“exponential”, “linear”, “area”}, optional. Method to use for the width of the letter value boxes. All give similar results visually. “linear” reduces...

seaborn.boxenplot — seaborn 0.9.0 documentation
scale : “linear” | “exponential” | “area”. Method to use for the width of the letter value boxes. All give similar results visually....

Python - seaborn.boxenplot() method - GeeksforGeeks
scale : Method to use for the width of the letter value boxes. outlier_prop : Proportion of data believed to be outliers. showfliers...

Letter-value plots: Boxplots for large data - Hadley Wickham
from 1341 (box #32) to 7865 (Box #13), with a median sample size of ... formula, SEfactor, for the first 20 letter values,...
