BUG: Categorical of booleans has object .categories

See original GitHub issue

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

In [1]: import pandas as pd
   ...: 
   ...: obs = pd.DataFrame({"int_cat":[1,2,3], "bool_cat":[True,False,False]}, dtype="category")
   ...: obs["bool_cat"].cat.categories.dtype
Out[1]: dtype('O')

In [2]: bools = pd.Series([True, False, True])
   ...: bools.dtype
Out[2]: dtype('bool')

In [3]: bools.astype("category").cat.categories.dtype
Out[3]: dtype('O')

In [4]: pd.__version__
Out[4]: '1.4.1'

Issue Description

I would assume a categorical generated from a boolean array would have bool category values, not object.

This is weird, though admittedly an unusual use case. This issue is upstream of https://github.com/theislab/anndata/issues/724

Expected Behavior

I would have expected the dtype of the categories to be boolean.
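Until the categories carry a boolean dtype, downstream code can avoid depending on it. The following is a minimal sketch of one possible workaround, not something proposed in the issue: it inspects the category values through `Index.inferred_type` and casts the categorical back with `astype(bool)`. The variable names `categories_are_boolean` and `restored` are illustrative only.

```python
import pandas as pd

bools = pd.Series([True, False, True])
cat = bools.astype("category")

# The categories dtype depends on the pandas version (object on 1.4.1,
# as reported above), so inspect the values rather than the dtype.
categories_are_boolean = cat.cat.categories.inferred_type == "boolean"

# Casting the categorical back to bool does not rely on the categories dtype.
restored = cat.astype(bool)

print(cat.cat.categories.dtype, categories_are_boolean, restored.dtype)
```

The printed categories dtype will differ between pandas versions, which is why the sketch does not branch on it.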

Installed Versions

```
commit : 06d230151e6f18fdb8139d09abf539867a8cd481
python : 3.9.10.final.0
python-bits : 64
OS : Darwin
OS-release : 20.6.0
Version : Darwin Kernel Version 20.6.0: Wed Nov 10 22:23:07 PST 2021; root:xnu-7195.141.14~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8

pandas : 1.4.1
numpy : 1.21.5
pytz : 2021.3
dateutil : 2.8.2
pip : 22.0.3
setuptools : 60.5.0
Cython : 0.29.25
pytest : 7.0.1
hypothesis : None
sphinx : 4.1.2
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.3
IPython : 8.1.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
fsspec : 2022.02.0
gcsfs : None
matplotlib : 3.5.1
numba : 0.55.1
numexpr : 2.8.1
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : 7.0.0
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.8.0
sqlalchemy : 1.4.31
tables : 3.7.0
tabulate : 0.8.9
xarray : 0.21.1
xlrd : 1.2.0
xlwt : None
zstandard : None
```



Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

phofl commented, Mar 11, 2022 (1 reaction)

This is fixed on main.

May need tests
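
A regression test for this behavior could look roughly like the sketch below. It is an assumption about what such a test might check, not the test that was added to pandas; the helper `bool_categories_dtype` and the test name are hypothetical.

```python
import pandas as pd


def bool_categories_dtype() -> str:
    """Return the categories dtype name for a categorical built from booleans."""
    bools = pd.Series([True, False, True])
    return str(bools.astype("category").cat.categories.dtype)


def test_bool_categories_dtype():
    # Expected behavior from the issue: the categories keep the bool dtype.
    # On pandas 1.4.1 this assertion fails, because the dtype is object.
    assert bool_categories_dtype() == "bool"
```

Run under pytest, the test passes only on a pandas build where the categories keep the boolean dtype.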

Kyrpel commented, Mar 18, 2022 (0 reactions)

Hi, my name is Matias, I'm from Argentina. I'm new to the open-source community and would like to try it, so tell me if I can help you with this.

Hi Matias. If you find an issue that interests you, just write “take” in the comments and the issue will be assigned to you, so other contributors know about it. If you want to drop the issue, just press “unassign me” to the right of the issue description.
