Setting hue in pairplot on a string column with only 1 item in a category throws exception: ValueError: `dataset` input should have multiple elements.

seaborn 0.9.0, installed via pip.

I have 10 rows and am trying to create a pairplot. The plot works fine until I set hue to a string (object) column that has 4 categories with row counts of (4, 3, 2, 1). Stack trace below:

ValueError                                Traceback (most recent call last)
<ipython-input-62-cc1e7428015b> in <module>()
      8 # ds.dtypes
      9 # ds
---> 10 sns.pairplot(ds, vars=['mark', 'days', 'hours', 'refs'], hue='topic')
     11 # dataset.iloc[0:10,[8,19]]
     12 

/Users/piglet/Library/Python/2.7/lib/python/site-packages/seaborn/axisgrid.pyc in pairplot(data, hue, hue_order, palette, vars, x_vars, y_vars, kind, diag_kind, markers, height, aspect, dropna, plot_kws, diag_kws, grid_kws, size)
   2109             diag_kws.setdefault("shade", True)
   2110             diag_kws["legend"] = False
-> 2111             grid.map_diag(kdeplot, **diag_kws)
   2112     
   2113     # Maybe plot on the off-diagonals

/Users/piglet/Library/Python/2.7/lib/python/site-packages/seaborn/axisgrid.pyc in map_diag(self, func, **kwargs)
   1397                     color = fixed_color
   1398                 
-> 1399                 func(data_k, label=label_k, color=color, **kwargs)
   1400             
   1401             self._clean_axis(ax)

/Users/piglet/Library/Python/2.7/lib/python/site-packages/seaborn/distributions.pyc in kdeplot(data, data2, shade, vertical, kernel, bw, gridsize, cut, clip, legend, cumulative, shade_lowest, cbar, cbar_ax, cbar_kws, ax, **kwargs)
    689         ax = _univariate_kdeplot(data, shade, vertical, kernel, bw,
    690                                  gridsize, cut, clip, legend, ax,
--> 691                                  cumulative=cumulative, **kwargs)
    692     
    693     return ax

/Users/piglet/Library/Python/2.7/lib/python/site-packages/seaborn/distributions.pyc in _univariate_kdeplot(data, shade, vertical, kernel, bw, gridsize, cut, clip, legend, ax, cumulative, **kwargs)
    292                               "only implemented in statsmodels."
    293                               "Please install statsmodels.")
--> 294         x, y = _scipy_univariate_kde(data, bw, gridsize, cut, clip)
    295 
    296     # Make sure the density is nonnegative

/Users/piglet/Library/Python/2.7/lib/python/site-packages/seaborn/distributions.pyc in _scipy_univariate_kde(data, bw, gridsize, cut, clip)
    364     """Compute a univariate kernel density estimate using scipy."""
    365     try:
--> 366         kde = stats.gaussian_kde(data, bw_method=bw)
    367     except TypeError:
    368         kde = stats.gaussian_kde(data)

/Users/piglet/Library/Python/2.7/lib/python/site-packages/scipy/stats/kde.pyc in __init__(self, dataset, bw_method)
    167         self.dataset = atleast_2d(dataset)
    168         if not self.dataset.size > 1:
--> 169             raise ValueError("`dataset` input should have multiple elements.")
    170 
    171         self.d, self.n = self.dataset.shape

ValueError: `dataset` input should have multiple elements.
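
The last frame of the trace can be reproduced without seaborn. The snippet below is a minimal sketch, assuming SciPy is installed; `try_kde` is a hypothetical helper written for this illustration, not part of SciPy or seaborn. It shows that `scipy.stats.gaussian_kde` rejects a dataset with a single element, which is what the hue category with only one row ends up passing in.

```python
from scipy import stats


def try_kde(values):
    """Return None if the KDE can be built, else the error message.

    Hypothetical helper for illustration only.
    """
    try:
        stats.gaussian_kde(values)
    except ValueError as err:
        return str(err)
    return None


# One observation, like the hue category that has a single row.
single = try_kde([7.0])
# Several distinct observations, like the larger categories.
several = try_kde([1.0, 2.0, 4.0])

print(single)
print(several)
```

The first call reports the same "multiple elements" error as the traceback, while the second builds the estimate without complaint.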

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 13 (7 by maintainers)

Top GitHub Comments

1 reaction
adam-ah commented, Dec 20, 2018

I wouldn’t agree with this statement.

I must have misread your argument because surely you aren’t saying that a plotting library should throw a random exception from a third party library because it was asked to color in dots?

You can throw a warning, saying that the diagram diagonal isn’t using colours, because of whatever reason.

You can throw an error, saying that ‘kde’ diagonal requires at least two elements in each group (for whatever reason), and say that you can use the ‘hist’ parameter to draw a histogram instead of kde.

But what you cannot do is to let a third party library throw an arbitrary exception because seaborn didn’t check the required parameters the library needs to do its thing.

The issue here isn’t with seaborn, the issue is with the underlying math.

Just like we cannot blame calling toString() on a null value, or blame sqrt() when passing in a negative value, we cannot blame a third party library when we are feeding in incorrect parameters.

EDIT: With the .pairplot API the user never actually asked for a kde plot, only for a pair-plot, and seaborn decided to use a kde plot; it’s difficult to reason that it is a user error either. The data can be plotted and colored in, so overall it would be hard to conclude that it is not a bug.
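
The kind of up-front check described in this comment could look roughly like the sketch below. It is only an illustration under stated assumptions: `check_hue_groups`, its `min_size` parameter, and the error wording are hypothetical and are not seaborn's API. It uses only the standard library and accepts any iterable of hue values, so a column such as `ds['topic']` could be passed in directly.

```python
from collections import Counter


def check_hue_groups(hue_values, min_size=2):
    """Raise a descriptive error if a hue category is too small for a KDE.

    Hypothetical helper for illustration; not part of seaborn.
    """
    counts = Counter(hue_values)
    too_small = sorted(str(level) for level, n in counts.items() if n < min_size)
    if too_small:
        raise ValueError(
            "kde diagonal needs at least {} rows per hue category; "
            "too few rows for: {}. Consider diag_kind='hist'.".format(
                min_size, ", ".join(too_small)
            )
        )


# Same breakdown as in the report: four categories with 4, 3, 2 and 1 rows.
# The category names are made up for the example.
topics = ["a"] * 4 + ["b"] * 3 + ["c"] * 2 + ["d"]

try:
    check_hue_groups(topics)
    message = None
except ValueError as err:
    message = str(err)

print(message)
```

On the caller's side, the histogram alternative mentioned in the comment corresponds to passing `diag_kind='hist'` to `sns.pairplot`, which keeps the diagonal away from the KDE code path shown in the traceback.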

0 reactions
mwaskom commented, Dec 20, 2018

Well, you’ve failed in that aim, but succeeded in irritating me to the point that I’ve moved this issue to the bottom of the stack of things I choose to spend time working on. Congrats!
