Some categorical plots fail when y is datetime
Categorical plots are not interchangeable when the dependent variable is np.datetime64. Apart from stripplot and swarmplot, any of them may fail. Consider the following code:
import seaborn as sns
import numpy as np
dates = np.arange('2020-03', '2020-04', dtype='datetime64[D]')
cats = np.repeat("cat", len(dates))
for func in [sns.stripplot, sns.swarmplot, sns.boxplot, sns.boxenplot, sns.violinplot, sns.barplot]:
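    try:
        # Pass x and y by keyword: newer seaborn releases no longer accept them positionally.
        func(x=cats, y=dates)
    except Exception as ex:
        # Report which plotting function failed and why.
        print(str(func), type(ex), ex)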
The above code prints the following lines:
<function boxplot at 0x7f520e7af598> <class 'numpy.core._exceptions.UFuncTypeError'> ufunc 'add' cannot use operands with types dtype('<M8[ns]') and dtype('<M8[ns]')
<function boxenplot at 0x7f520e7af730> <class 'numpy.core._exceptions.UFuncTypeError'> ufunc 'multiply' cannot use operands with types dtype('<M8[ns]') and dtype('float64')
<function violinplot at 0x7f520e7af620> <class 'TypeError'> invalid type promotion
<function barplot at 0x7f520e7af8c8> <class 'numpy.core._exceptions.UFuncTypeError'> ufunc 'add' cannot use operands with types dtype('<M8[ns]') and dtype('<M8[ns]')
The problem is that a datetime variable is identified neither as categorical nor as non-numeric, so the individual plotters attempt operations that expect numeric input. The categorical plot docstrings currently don’t mention datetime y (only datetime x).
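The failing operations can be reproduced with NumPy alone. The following is a minimal sketch, not seaborn's actual code path: it shows that adding two datetime64 arrays raises a TypeError subclass, while subtraction is well defined. The cast to integer day counts at the end is only an assumed numeric stand-in for illustration, and the names add_failed, as_days and midpoints are hypothetical.

```python
import numpy as np

dates = np.arange('2020-03', '2020-04', dtype='datetime64[D]')

# Adding two datetimes has no meaning, so NumPy refuses with a TypeError subclass.
try:
    dates[:-1] + dates[1:]
    add_failed = False
except TypeError as ex:
    add_failed = True
    print(type(ex).__name__, ex)

# Subtracting datetimes is well defined and yields timedelta64 values.
deltas = dates[1:] - dates[:-1]
print(deltas.dtype)

# Assumption for illustration: integer day counts since the epoch stand in for the dates,
# which makes ordinary numeric arithmetic (here, midpoints) possible again.
as_days = dates.astype('int64')
midpoints = (as_days[:-1] + as_days[1:]) / 2
print(midpoints[:3])
```

Any statistic that sums or averages the raw datetime values hits the same restriction, which is consistent with the 'add' and 'multiply' errors reported above.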
Probably the easiest fix to maintain consistency between all categorical plots would be to change the following method so that it identifies dates as non-numeric (perhaps using pandas.api.types.is_numeric_dtype): https://github.com/mwaskom/seaborn/blob/562fd90216d5529ab8f26b388168e6ef260d7e6c/seaborn/categorical.py#L330-L335 However, this would also disable stripplot and swarmplot with dates.
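As a rough illustration of that suggestion, the sketch below checks how pandas classifies the two kinds of input. The helper looks_numeric is hypothetical and is not the seaborn method linked above; it only shows what a check based on pandas.api.types.is_numeric_dtype would report.

```python
import numpy as np
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype

dates = np.arange('2020-03', '2020-04', dtype='datetime64[D]')
floats = np.linspace(0.0, 1.0, len(dates))


def looks_numeric(values):
    """Hypothetical helper: True only when the values have a numeric dtype."""
    return bool(is_numeric_dtype(np.asarray(values)))


# Floats are reported as numeric; datetimes are not.
print(looks_numeric(floats))
print(looks_numeric(dates))

# Datetimes can be detected explicitly if they need separate handling.
print(bool(is_datetime64_any_dtype(dates)))
```

A check like this would make every categorical plotter reject datetime input in the same way, which is the trade-off noted above for stripplot and swarmplot.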
Issue Analytics
- Created 3 years ago
- Comments: 6 (5 by maintainers)
Top GitHub Comments
I think I’m having a similar issue: I have a dataframe with a country categorical column and a created_at datetime column. Using catplot(data, kind='swarm', x='country', y='created_at') works fine, but when I use kind='violin', Seaborn complains with the aforementioned TypeError: Neither the x nor y variable appears to be numeric. I would expect otherwise, because I can still e.g. plot a distribution for all dates using distplot, and a violin plot does essentially that, just per category.

I’ll close this as the reason for failures is now clear for the user. Will open a separate issue with an example regarding qualitative colormap when hue is datetime. Thanks!