Setting hue in pairplot on a string column with only 1 item in a category throws exception: ValueError: `dataset` input should have multiple elements.
seaborn 0.9.0, installed via pip.
I have 10 rows and am trying to create a pairplot. The plot works fine until I set hue to a string (object) column that has 4 categories with group sizes of (4, 3, 2, 1). Stack trace below:
ValueError Traceback (most recent call last)
<ipython-input-62-cc1e7428015b> in <module>()
8 # ds.dtypes
9 # ds
---> 10 sns.pairplot(ds, vars=['mark', 'days', 'hours', 'refs'], hue='topic')
11 # dataset.iloc[0:10,[8,19]]
12
/Users/piglet/Library/Python/2.7/lib/python/site-packages/seaborn/axisgrid.pyc in pairplot(data, hue, hue_order, palette, vars, x_vars, y_vars, kind, diag_kind, markers, height, aspect, dropna, plot_kws, diag_kws, grid_kws, size)
2109 diag_kws.setdefault("shade", True)
2110 diag_kws["legend"] = False
-> 2111 grid.map_diag(kdeplot, **diag_kws)
2112
2113 # Maybe plot on the off-diagonals
/Users/piglet/Library/Python/2.7/lib/python/site-packages/seaborn/axisgrid.pyc in map_diag(self, func, **kwargs)
1397 color = fixed_color
1398
-> 1399 func(data_k, label=label_k, color=color, **kwargs)
1400
1401 self._clean_axis(ax)
/Users/piglet/Library/Python/2.7/lib/python/site-packages/seaborn/distributions.pyc in kdeplot(data, data2, shade, vertical, kernel, bw, gridsize, cut, clip, legend, cumulative, shade_lowest, cbar, cbar_ax, cbar_kws, ax, **kwargs)
689 ax = _univariate_kdeplot(data, shade, vertical, kernel, bw,
690 gridsize, cut, clip, legend, ax,
--> 691 cumulative=cumulative, **kwargs)
692
693 return ax
/Users/piglet/Library/Python/2.7/lib/python/site-packages/seaborn/distributions.pyc in _univariate_kdeplot(data, shade, vertical, kernel, bw, gridsize, cut, clip, legend, ax, cumulative, **kwargs)
292 "only implemented in statsmodels."
293 "Please install statsmodels.")
--> 294 x, y = _scipy_univariate_kde(data, bw, gridsize, cut, clip)
295
296 # Make sure the density is nonnegative
/Users/piglet/Library/Python/2.7/lib/python/site-packages/seaborn/distributions.pyc in _scipy_univariate_kde(data, bw, gridsize, cut, clip)
364 """Compute a univariate kernel density estimate using scipy."""
365 try:
--> 366 kde = stats.gaussian_kde(data, bw_method=bw)
367 except TypeError:
368 kde = stats.gaussian_kde(data)
/Users/piglet/Library/Python/2.7/lib/python/site-packages/scipy/stats/kde.pyc in __init__(self, dataset, bw_method)
167 self.dataset = atleast_2d(dataset)
168 if not self.dataset.size > 1:
--> 169 raise ValueError("`dataset` input should have multiple elements.")
170
171 self.d, self.n = self.dataset.shape
ValueError: `dataset` input should have multiple elements.
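The traceback ends in scipy's `gaussian_kde`, which rejects any input with a single element, so the failure comes from the hue group that has only one row. As a possible workaround (a minimal sketch, not seaborn's own behaviour), the group sizes can be checked up front and the diagonal switched to a histogram when any group is too small. The helper names `small_groups` and `choose_diag_kind` and the sample labels are hypothetical; only the (4, 3, 2, 1) breakdown comes from the report.

```python
from collections import Counter


def small_groups(labels, min_size=2):
    """Return {category: count} for categories with fewer than min_size rows."""
    counts = Counter(labels)
    return {cat: n for cat, n in counts.items() if n < min_size}


def choose_diag_kind(labels):
    """Pick 'kde' only when every hue group has at least two rows."""
    return "hist" if small_groups(labels) else "kde"


# Hypothetical labels matching the (4, 3, 2, 1) breakdown from the report.
topics = ["a"] * 4 + ["b"] * 3 + ["c"] * 2 + ["d"]

print(small_groups(topics))      # {'d': 1}
print(choose_diag_kind(topics))  # hist
```

With the data frame from the report, the result could then be passed along as `sns.pairplot(ds, vars=['mark', 'days', 'hours', 'refs'], hue='topic', diag_kind=choose_diag_kind(ds['topic']))`, assuming the `topic` column has no missing values.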
Issue Analytics
- State:
- Created 5 years ago
- Comments:13 (7 by maintainers)
I must have misread your argument because surely you aren’t saying that a plotting library should throw a random exception from a third party library because it was asked to color in dots?
You can throw a warning, saying that the diagram diagonal isn’t using colours, because of whatever reason.
You can throw an error, saying that a ‘kde’ diagonal requires at least two elements in each group (for whatever reason), and say that you can pass diag_kind='hist' to draw a histogram instead of a kde.
But what you cannot do is to let a third party library throw an arbitrary exception because seaborn didn’t check the required parameters the library needs to do its thing.
Just as we cannot blame toString() for failing on a null value, or sqrt() for failing on a negative one, we cannot blame a third-party library when we feed it invalid parameters.
EDIT: With the .pairplot API the user never actually asked for a kde plot, only for a pair-plot, and seaborn decided to use a kde plot; it’s difficult to reason that it is a user error either. The data can be plotted and colored in, so overall it would be hard to conclude that it is not a bug.
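The kind of check described in this comment could look roughly like the sketch below: validate the hue groups before any KDE is computed, and raise an error that names the problem and the alternative. This is an illustration only; `check_kde_groups` is a hypothetical helper and not part of seaborn, and the message wording is an assumption.

```python
from collections import Counter


def check_kde_groups(labels, hue_name="hue"):
    """Raise a descriptive ValueError if any hue level has fewer than two rows.

    Hypothetical guard; it only inspects the labels and never calls seaborn.
    """
    counts = Counter(labels)
    too_small = sorted(str(cat) for cat, n in counts.items() if n < 2)
    if too_small:
        raise ValueError(
            "A 'kde' diagonal needs at least two observations per {} level; "
            "too few for: {}. Consider diag_kind='hist'.".format(
                hue_name, ", ".join(too_small)
            )
        )


# Example with the (4, 3, 2, 1) breakdown; the label values are made up.
topics = ["a"] * 4 + ["b"] * 3 + ["c"] * 2 + ["d"]
try:
    check_kde_groups(topics, hue_name="topic")
except ValueError as err:
    print(err)
```

The point of the design is that the message refers to the caller's own arguments (the hue column and `diag_kind`) rather than to scipy's internal `dataset` parameter.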
Well, you’ve failed in that aim, but succeeded in irritating me to the point that I’ve moved this issue to the bottom of the stack of things I choose to spend time working on. Congrats!