categorical does not work with Pandas DataFrames
In [9]: sm.categorical(pd.DataFrame({'a':[1,2,12], 'b':['a','b','a']}), col='a')
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-9-5966a3ee6951> in <module>()
----> 1 sm.categorical(pd.DataFrame({'a':[1,2,12], 'b':['a','b','a']}), col='a')
/usr/local/lib/python2.7/dist-packages/statsmodels-0.6.0-py2.7-linux-x86_64.egg/statsmodels/tools/tools.pyc in categorical(data, col, dictnames, drop)
143 #TODO: add a NameValidator function
144 # catch recarrays and structured arrays
--> 145 if data.dtype.names or data.__class__ is np.recarray:
146 if not col and np.squeeze(data).ndim > 1:
147 raise IndexError("col is None and the input array is not 1d")
/usr/local/lib/python2.7/dist-packages/pandas-0.13.0_285_gfcfaa7d-py2.7-linux-x86_64.egg/pandas/core/generic.pyc in __getattr__(self, name)
1802 return self[name]
1803 raise AttributeError("'%s' object has no attribute '%s'" %
-> 1804 (type(self).__name__, name))
1805
1806 def __setattr__(self, name, value):
AttributeError: 'DataFrame' object has no attribute 'dtype'
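The traceback comes down to `categorical` reading `data.dtype.names`, which assumes a NumPy structured array or recarray. A minimal sketch of the difference, assuming a reasonably recent pandas and NumPy (the variable names are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 12], 'b': ['a', 'b', 'a']})

# A DataFrame keeps one dtype per column under `.dtypes`;
# it has no single `.dtype`, so `data.dtype.names` raises AttributeError.
has_dtype = hasattr(df, 'dtype')
has_dtypes = hasattr(df, 'dtypes')

# A structured array has one dtype with named fields,
# which is what the check on line 145 of tools.py expects.
rec = np.array([(1, 'a'), (2, 'b'), (12, 'a')],
               dtype=[('a', 'i8'), ('b', 'U1')])
field_names = rec.dtype.names

# DataFrame.to_records also yields an object with dtype.names.
# Whether categorical then handles it as desired is not checked here.
record_names = df.to_records(index=False).dtype.names

print(has_dtype, has_dtypes, field_names, record_names)
```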
An alternative is to use pd.get_dummies:
pd.get_dummies(pd.DataFrame({'a':[1,2,12], 'b':['a','b','a']}).b)
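A slightly fuller sketch of the `pd.get_dummies` route, in case it helps. The `prefix` and `dtype=float` arguments are choices made for this example, not part of the suggestion above: `prefix` avoids a clash with the existing column names, and `dtype=float` keeps the output numeric regardless of the pandas default.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 12], 'b': ['a', 'b', 'a']})

# One indicator column per distinct value in column 'b'.
dummies = pd.get_dummies(df['b'], prefix='b', dtype=float)

# Attach the indicator columns next to the original data.
result = pd.concat([df, dummies], axis=1)

print(result)
```

This yields one indicator column per level of `b` alongside the original columns; drop the original `b` column afterwards if only the numeric design matrix is needed.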
Issue Analytics
- State:
- Created 10 years ago
- Comments:11 (6 by maintainers)
categorical still doesn't work with a DataFrame, which is a bit annoying for updating older code, e.g. test_glm #4432.

categorical is still nice when we quickly want to get an exog without going through the formulas or the full pandas categorical type.

I have submitted a PR #3203 on this issue. It should handle DataFrames as well as other data formats.