BUG: Categorical of booleans has object .categories
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
In [1]: import pandas as pd
...:
...: obs = pd.DataFrame({"int_cat":[1,2,3], "bool_cat":[True,False,False]}, dtype="category")
...: obs["bool_cat"].cat.categories.dtype
Out[1]: dtype('O')
In [2]: bools = pd.Series([True, False, True])
...: bools.dtype
Out[2]: dtype('bool')
In [3]: bools.astype("category").cat.categories.dtype
Out[3]: dtype('O')
In [4]: pd.__version__
Out[4]: '1.4.1'
Issue Description
I would assume a categorical generated from a boolean array would have bool category values, not object.
This is weird, but it’s admittedly a funny use case. Upstream of https://github.com/theislab/anndata/issues/724
Expected Behavior
I would have expected the dtype of the categories to be boolean.
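A minimal sketch of how the reported behavior can be checked, and how boolean values can still be recovered while the categories are stored as object. The helper names `categories_dtype` and `as_bool_series` are hypothetical and exist only for this illustration; which dtype is printed depends on the installed pandas version.

```python
import numpy as np
import pandas as pd


def categories_dtype(values):
    """Return the dtype of the categories inferred by astype("category")."""
    return pd.Series(values).astype("category").cat.categories.dtype


def as_bool_series(cat_series):
    """Convert a categorical Series of True/False back to a plain bool Series.

    Assumes the Series has no missing values. Works whether the
    categories are stored as bool or as object.
    """
    return cat_series.astype(bool)


bools = pd.Series([True, False, True])
cat = bools.astype("category")

# On the affected version reported above this is object;
# the expected behavior is bool.
print(categories_dtype(bools))

# Casting back to bool does not depend on the categories' dtype.
restored = as_bool_series(cat)
print(restored.dtype)
```

Casting back with `astype(bool)` is only a workaround for code that needs real boolean values. It does not change how the categories themselves are stored.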
Installed Versions
pandas : 1.4.1
numpy : 1.21.5
pytz : 2021.3
dateutil : 2.8.2
pip : 22.0.3
setuptools : 60.5.0
Cython : 0.29.25
pytest : 7.0.1
hypothesis : None
sphinx : 4.1.2
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.3
IPython : 8.1.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
fsspec : 2022.02.0
gcsfs : None
matplotlib : 3.5.1
numba : 0.55.1
numexpr : 2.8.1
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : 7.0.0
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.8.0
sqlalchemy : 1.4.31
tables : 3.7.0
tabulate : 0.8.9
xarray : 0.21.1
xlrd : 1.2.0
xlwt : None
zstandard : None
Issue Analytics
- Created 2 years ago
- Reactions: 1
- Comments: 6 (4 by maintainers)
Top GitHub Comments
This is fixed on main.
May need tests
Hi Matias. If you find an issue that interests you, just write “take” in the comments and the issue will be assigned to you, so other contributors know you are working on it. If you want to drop the issue, just press “unassign me” to the right of the issue description.