BUG: grouped.last() will sometimes turn a boolean column into Int64

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

df = pd.DataFrame(
    {
        'id': [1, 2, 3, 4],
        'test': [True, pd.NA, pd.NA, False]
    }
).convert_dtypes()

grouped = df.groupby('id')
bad = grouped.last()
assert bad.test.dtype == pd.BooleanDtype() # fails

Issue Description

On the latest master this returns an Int64 column.

    test
id
1      1
2   <NA>
3   <NA>
4      0

I checked 1.4.0 and it properly returns the boolean dtype.

     test
id
1    True
2    <NA>
3    <NA>
4   False

Oddly, changing the third value (id 3) from pd.NA to True or False gives the proper dtype.
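The comparison described above can be sketched as follows. This is a minimal check, assuming the remark refers to the third value in the `test` column; the helper name `last_dtype` is hypothetical and not part of pandas. The printed dtypes depend on the installed pandas version, so no expected output is shown.

```python
import pandas as pd


def last_dtype(values):
    """Return the dtype name of 'test' after groupby('id').last()."""
    df = pd.DataFrame({"id": [1, 2, 3, 4], "test": values}).convert_dtypes()
    return str(df.groupby("id").last()["test"].dtype)


# Original data from the report: two missing values.
with_two_na = last_dtype([True, pd.NA, pd.NA, False])

# Variant: the third value replaced by True (an assumed reading of the remark).
with_one_na = last_dtype([True, pd.NA, True, False])

print("two NA:", with_two_na)
print("one NA:", with_one_na)
```

On an affected build the first line would be expected to report Int64 and the second boolean; on a build without the bug both should report boolean.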

Expected Behavior

Retain the boolean dtype from df.test.
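Until the dtype is retained by pandas itself, one possible workaround is to cast the grouped result back to the dtypes of the source frame. The sketch below is an assumption, not a fix proposed in the issue; `restore_dtypes` is a hypothetical helper name.

```python
import pandas as pd


def restore_dtypes(result, original):
    """Cast each column of `result` back to its dtype in `original`."""
    wanted = {
        col: original[col].dtype
        for col in result.columns
        if col in original.columns
    }
    return result.astype(wanted)


df = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "test": [True, pd.NA, pd.NA, False],
    }
).convert_dtypes()

# 'id' becomes the index of the result, so only 'test' is cast back.
fixed = restore_dtypes(df.groupby("id").last(), df)
print(fixed["test"].dtype)
```

The cast is a no-op when the column is already boolean, so the helper should be safe to leave in place on versions that do not show the bug.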

Installed Versions

INSTALLED VERSIONS

commit : 663147edd35bc3e0362f7d637c8d5f5e597f961b
python : 3.10.0.final.0
python-bits : 64
OS : Linux
OS-release : 5.16.12-arch1-1
Version : #1 SMP PREEMPT Wed, 02 Mar 2022 12:22:51 +0000
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.5.0.dev0+545.g663147edd3
numpy : 1.21.5
pytz : 2021.3
dateutil : 2.8.2
pip : 21.3.1
setuptools : 58.5.3
Cython : 0.29.28
pytest : 6.2.5
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.7.1
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.3
IPython : 7.29.0
pandas_datareader : None
bs4 : 4.10.0
bottleneck : None
brotli : None
fastparquet : None
fsspec : None
gcsfs : None
markupsafe : 2.0.1
matplotlib : None
numba : 0.55.1
numexpr : None
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : 8.0.0.dev230+gb2ae3d74d
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.8.0
snappy : None
sqlalchemy : 2.0.0b1
tables : None
tabulate : 0.8.9
xarray : None
xlrd : None
xlwt : None
zstandard : None

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

ayeshaam commented, Apr 5, 2022 (1 reaction)

take

sydneyeh commented, Apr 5, 2022 (1 reaction)

take
