cast to float when using ``groupby.agg`` with function returning ``int`` on ``float`` input

Code Sample, a copy-pastable example if possible

In [1]: import pandas as pd

In [2]: df = pd.DataFrame([[1], [2], [3.3]])

In [3]: df.groupby([1,1,1]).agg(len)
Out[3]: 
     0
1  3.0

Problem description

The result of len should be an int, regardless of the dtype of the input. This is not specific to len: lambda x: 3 produces the same float result.

Expected Output

Compare to

In [4]: df.apply(len)
Out[4]: 
0    3
dtype: int64

In [5]: df.groupby([1,1,1]).apply(len)
Out[5]: 
1    3
dtype: int64

In [6]: df.astype(int).groupby([1,1,1]).agg(len)
Out[6]: 
   0
1  3
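
On a version that shows the float result, the integer dtype can be forced from user code. The following is a minimal sketch of two workarounds, not a fix for the underlying behavior; the variable names are illustrative, and it assumes only the public `astype` and `GroupBy.size` calls.

```python
import pandas as pd

df = pd.DataFrame([[1], [2], [3.3]])
keys = [1, 1, 1]

# agg(len) may come back as float64 on affected versions, so cast explicitly.
counts_agg = df.groupby(keys).agg(len).astype("int64")

# GroupBy.size() counts rows per group and does not depend on the column dtype.
counts_size = df.groupby(keys).size()

print(counts_agg.dtypes)
print(counts_size.dtype)
```

The explicit cast is only safe when the aggregating function is known to return whole numbers, as len does.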

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: 9e7666dae3b3b10d987ce154a51c78bcee6e0728
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-3-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.21.0.dev+265.g9e7666dae
pytest: 3.0.6
pip: 9.0.1
setuptools: None
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 5.1.0.dev
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.0.2
openpyxl: None
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: None
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: 0.2.1

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

toobaz commented, Jul 20, 2017 (1 reaction)

"you may have integers as a result, so you can try to downcast"

My point was that the result of len is an int, and should never become a float, so there should be no need to downcast (the fact that input data was float should be irrelevant). But I will probably just need to look at the code to understand what you mean.
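
For readers unfamiliar with the term, downcasting can be tried from user code with `pd.to_numeric`. This is only a sketch of the general idea, and it is not necessarily what pandas does internally inside `groupby.agg`; the variable names are illustrative.

```python
import pandas as pd

# A float column holding only whole numbers, like the 3.0 reported above.
as_float = pd.Series([3.0], index=[1])

# downcast="integer" converts to the smallest integer dtype that can hold
# the values, and leaves the data as float if any value is not whole.
as_int = pd.to_numeric(as_float, downcast="integer")
not_whole = pd.to_numeric(pd.Series([3.3]), downcast="integer")

print(as_int.dtype)
print(not_whole.dtype)
```

Because downcasting inspects the values, the resulting dtype depends on the data, which is the kind of input-dependent behavior the comment above argues against for a function like len.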

rhshadrach commented, Jul 5, 2020 (0 reactions)

Edit: The following comment is not relevant to the issue here since the handling of a string argument differs from that of a callable.

I ran into a similar issue with .transform('nunique'):

df = pd.DataFrame([[1, 1.1], [1, 3.1], [2, 1.1]], columns=['a', 'b'])
df.groupby('a').b.transform('nunique')

With pandas 0.25.3, the resulting Series has a float dtype. It becomes integer if the values in column b are integers, or if I replace .transform('nunique') with plain .nunique().
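
A minimal sketch of one way to get per-row integer counts for this example without relying on the dtype returned by transform: compute the per-group result with .nunique() and broadcast it back with Series.map. The names per_group and per_row are illustrative, and this is a workaround under those assumptions, not a description of how transform is implemented.

```python
import pandas as pd

df = pd.DataFrame([[1, 1.1], [1, 3.1], [2, 1.1]], columns=["a", "b"])

# One integer count of distinct b values per group, indexed by the values of a.
per_group = df.groupby("a")["b"].nunique()

# Broadcast the per-group counts back to the original rows via the key column.
per_row = df["a"].map(per_group)

print(per_row.dtype)
```

Since every value of a appears in per_group, the mapping introduces no missing values and the integer dtype is kept.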
