Feature Request: Truncated Distributions for Violin Plots

I would like to generate violin plots for truncated distributions, e.g. for efficiency scores which are always between 0 and 100%. My current approach is to use the parameter cut=0 when calling sns.violinplot, but I think that a more informative approach is to reflect the density at the truncation point, so that, for example, the area which would be drawn below zero in an unrestricted kde will appear above zero in the truncated version.
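Written out, the reflection described above amounts to the following, where $a$ and $b$ are assumed names for the lower and upper truncation points and $\hat f$ is the unrestricted KDE. This is a sketch of a single reflection at each bound; it recovers the full probability mass only approximately, as long as the kernel tails do not reach past the opposite bound.

```latex
\hat f_{\mathrm{refl}}(x) =
\begin{cases}
  \hat f(x) + \hat f(2a - x) + \hat f(2b - x), & a \le x \le b, \\
  0, & \text{otherwise.}
\end{cases}
```

With $a = 0$, the term $\hat f(2a - x) = \hat f(-x)$ is exactly the density that would have been drawn below zero, folded back above it.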

Here is a little example that illustrates my concern and a potential solution:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# IPython/Jupyter magic; omit the next line in a plain script
%matplotlib inline

np.random.seed(12345)
df = pd.DataFrame(np.random.normal(size=(10,3)).clip(0,5))

sns.violinplot(data=df)

Note the disturbing non-zero density on negative values. Fixing this with cut=0 looks like this:

sns.violinplot(data=df, cut=0)

No more positive density outside the support of the data. But this truncated normal should have maximum density at zero, and the feature I am requesting is a way to ask for that. Here is a very hacky way to get something that would satisfy me:

t = sns.categorical._ViolinPlotter.fit_kde
def reflected_once_kde(self, x, bw):
    kde, bw_used = t(self, x, bw)

    kde_evaluate = kde.evaluate

    def zero_to_five_truncated_kde_evaluate(x):
        val = kde_evaluate(x)
        val += kde_evaluate(-x)
        val += kde_evaluate(5-(x-5))
        return np.where((x<0)|(x>5), 0, val)

    kde.evaluate = zero_to_five_truncated_kde_evaluate
    return kde, bw_used

sns.categorical._ViolinPlotter.fit_kde = reflected_once_kde
sns.violinplot(data=df, cut=0)
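The same reflection can be tried outside seaborn. The following is a minimal sketch using only NumPy, assuming a plain Gaussian kernel and Scott's rule-of-thumb bandwidth (seaborn's own bandwidth choice may differ). The names `gaussian_kde_pdf` and `reflected_kde_pdf` are hypothetical helpers, not part of any library.

```python
import numpy as np

def gaussian_kde_pdf(data, grid, bw):
    """Plain Gaussian KDE of `data`, evaluated on `grid` with bandwidth `bw`."""
    z = (grid[:, None] - data[None, :]) / bw
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(data) * bw * np.sqrt(2 * np.pi))

def reflected_kde_pdf(data, grid, lb, ub, bw=None):
    """KDE reflected once at each bound and set to zero outside [lb, ub]."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    if bw is None:
        # Assumption: Scott's rule of thumb for one-dimensional data.
        bw = data.std(ddof=1) * len(data) ** (-1 / 5)
    dens = (
        gaussian_kde_pdf(data, grid, bw)
        + gaussian_kde_pdf(data, 2 * lb - grid, bw)  # mirror about lb
        + gaussian_kde_pdf(data, 2 * ub - grid, bw)  # mirror about ub
    )
    return np.where((grid >= lb) & (grid <= ub), dens, 0.0)

rng = np.random.default_rng(12345)
sample = rng.normal(size=200).clip(0, 5)

inner_grid = np.linspace(0, 5, 2001)
inner_density = reflected_kde_pdf(sample, inner_grid, lb=0, ub=5)

outer_grid = np.array([-1.0, -0.1, 5.1, 6.0])
outer_density = reflected_kde_pdf(sample, outer_grid, lb=0, ub=5)

# Trapezoidal estimate of the total mass inside [0, 5].
mass = np.sum((inner_density[1:] + inner_density[:-1]) / 2 * np.diff(inner_grid))
```

Here `outer_density` is zero everywhere and `mass` stays close to one, because the density that leaked past each bound has been folded back inside.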

There is a previous feature request that asks for something similar at #244 which was closed when the implementation was overhauled in #410. Perhaps @PierreBdR or @mwaskom has some input about if and how my feature should be implemented.

I am up for doing some amount of work on this if it would be a welcome addition to Seaborn.

Issue Analytics

  • State: closed
  • Created 8 years ago
  • Comments: 12 (5 by maintainers)

Top GitHub Comments

4 reactions
aflaxman commented, Apr 30, 2015

Thanks for your work on this. In case anyone needs this sort of plot before the truncated KDE is finished, here is the monkey patch madness that I used in the end:

fit_kde_func = sns.categorical._ViolinPlotter.fit_kde

def reflected_once_kde(self, x, bw):
    lb=0
    ub=1

    kde, bw_used = fit_kde_func(self, x, bw)

    kde_evaluate = kde.evaluate

    def truncated_kde_evaluate(x):
        inside = (x >= lb) & (x <= ub)
        val = kde_evaluate(x)
        val += kde_evaluate(2 * lb - x)  # reflect about the lower bound
        val += kde_evaluate(2 * ub - x)  # reflect about the upper bound
        return np.where(inside, val, 0)

    kde.evaluate = truncated_kde_evaluate
    return kde, bw_used

sns.categorical._ViolinPlotter.fit_kde = reflected_once_kde
sns.violinplot(np.random.normal(size=10).clip(0,np.inf), cut=0, inner=None)

It made my violins look like gyro meat, which I kind of like.
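The patch above replaces `fit_kde` for the rest of the session. One way to limit its scope is the standard library's `unittest.mock.patch.object`, which restores the original attribute when the `with` block exits. The sketch below uses a hypothetical stand-in class (`Plotter`, `fit`, and `reflected_fit` are illustrative names, not seaborn API), since `_ViolinPlotter` is a private class.

```python
from unittest import mock

class Plotter:
    """Hypothetical stand-in for a class with a method worth patching."""

    def fit(self, x):
        return sum(x) / len(x)

# Keep a handle on the original method, as the patch above does.
original_fit = Plotter.fit

def reflected_fit(self, x):
    # Wrap the original result; doubling it is purely for illustration.
    return 2 * original_fit(self, x)

with mock.patch.object(Plotter, "fit", reflected_fit):
    inside = Plotter().fit([1, 2, 6])  # patched method is active here

outside = Plotter().fit([1, 2, 6])  # original method is restored
```

Applied to the real case, the plotting call would go inside the `with` block so that other plots in the same session keep the unpatched behavior.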

2 reactions
rrazaghi commented, Sep 19, 2018

Are there any updates or ways to do this? I would like to have something like the clip option in kdeplot for violinplot so that I don’t have to move to ggplot. Thanks
