DataFrameGroupBy.aggregate can not work with `tuple` as an argument

The following code raises a ValueError:

grouped_df = df.groupby(group_by_attributes, as_index=False).aggregate(tuple)

Here is a reproducible version:

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(100, 3), columns=list('ABC'))
grouped_df = df.groupby(['A', 'B'], as_index=False).aggregate(tuple)

Problem description

The statement above does not work, apparently because tuple is a built-in type rather than a plain function. It raises: ValueError: no results
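
The distinction drawn above can be checked with the standard library alone. The following is a minimal sketch, not taken from the original report: it only shows that tuple is callable but is a type rather than a function object, while the lambda wrapper is a genuine function. Whether pandas 0.20.3 rejected tuple for exactly this reason is the reporter's reading and is not verified here.

```python
import inspect

# tuple can be called like a function, but it is a class (type) object.
tuple_is_callable = callable(tuple)            # True
tuple_is_type = isinstance(tuple, type)        # True
tuple_is_function = inspect.isfunction(tuple)  # False

# Wrapping tuple in a lambda yields a real function object.
wrapper = lambda x: tuple(x)
wrapper_is_function = inspect.isfunction(wrapper)  # True

print(tuple_is_callable, tuple_is_type, tuple_is_function, wrapper_is_function)
```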

Workaround

Use the following groupby statement instead:

grouped_df = df.groupby(['A', 'B'], as_index=False).aggregate(lambda x: tuple(x))

This issue was filed as a result of the following discussion: https://github.com/PyCQA/pylint/issues/1709#issuecomment-341095096
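
With random data every (A, B) pair is unique, so each group holds a single row. The sketch below applies the same lambda workaround to a small hand-made frame (the values are illustrative, not from the original report) in which one group has two rows, so the collected tuples are easier to inspect. It assumes a pandas version in which aggregating with a Python callable that returns a tuple is accepted.

```python
import pandas as pd

# Illustrative data: the first two rows share the same (A, B) key.
df = pd.DataFrame({
    "A": ["x", "x", "y"],
    "B": [1, 1, 2],
    "C": [10, 20, 30],
})

# Workaround from the issue: wrap tuple in a lambda so pandas
# receives an ordinary function.
grouped_df = df.groupby(["A", "B"], as_index=False).aggregate(lambda x: tuple(x))

# Each cell of column C now holds the tuple of values in that group.
collected = grouped_df["C"].tolist()
print(grouped_df)
```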

Expected Output

aggregate(tuple) should work without raising a ValueError.

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-97-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.20.3
pytest: None
pip: 8.1.2
setuptools: 28.2.0
Cython: 0.24.1
numpy: 1.13.3
scipy: 0.18.1
xarray: None
IPython: 5.4.1
sphinx: 1.4.8
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.7
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0b10
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: None
pandas_gbq: None
pandas_datareader: None

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

bobhaffner commented, Nov 2, 2017

Hi @sinanonur, your example works in the newly released 0.21.0. Please upgrade when you can.

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(100, 3), columns=list('ABC'))
grouped_df = df.groupby(['A', 'B'], as_index=False).aggregate(tuple)
print(pd.__version__)
print(grouped_df.head())

0.21.0
          A         B                      C
0 -2.159796 -1.233800  (1.0704251018658486,)
1 -1.947438  2.082122  (0.5849118717358551,)
2 -1.738639  0.653051  (1.1259850203053805,)
3 -1.638240 -0.799216  (0.3626490086583796,)
4 -1.562435 -0.232689  (-1.120885109955278,)
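
For code that has to run on both sides of the 0.21.0 boundary mentioned in this comment, one option is to choose the aggregator from the installed version. This is only a sketch: tuple_aggregator is a hypothetical helper, not a pandas API, the version parsing is deliberately naive (it reads the first two dot-separated numbers), and the 0.21 threshold is taken from this comment rather than from the pandas release notes.

```python
import pandas as pd


def tuple_aggregator(version):
    """Hypothetical helper: pick how to collect group values into a tuple."""
    # Naive parse of "major.minor" from strings such as "0.20.3" or "0.21.0".
    major, minor = (int(part) for part in version.split(".")[:2])
    if (major, minor) >= (0, 21):
        return tuple  # reported in this thread to work from 0.21.0 on
    return lambda values: tuple(values)  # workaround for older releases


df = pd.DataFrame({"A": [1, 1, 2], "B": [5, 5, 6], "C": [0.1, 0.2, 0.3]})
aggregator = tuple_aggregator(pd.__version__)
grouped_df = df.groupby(["A", "B"], as_index=False).aggregate(aggregator)
print(grouped_df)
```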

gfyoung commented, Nov 4, 2017

Not sure why the if is needed in the first place?

Try removing and see what happens. When there’s a bug in our code, all bets are (almost) off 😄
