DataFrameGroupBy.aggregate can not work with `tuple` as an argument

The following code raises a ValueError:

grouped_df = df.groupby(group_by_attributes, as_index=False).aggregate(tuple)

Here is a more reproducible version:

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(100, 3), columns=list('ABC'))
grouped_df = df.groupby(['A', 'B'], as_index=False).aggregate(tuple)

Problem description

The statement above fails with `ValueError: no results`, apparently because `tuple` is a type rather than a plain function and is not accepted as an aggregation callable.

Workaround

Use the following groupby statement instead:

grouped_df = df.groupby(['A', 'B'], as_index=False).aggregate(lambda x: tuple(x))
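To make the effect of the workaround easier to see, here is a minimal sketch on a small, hand-written frame. The data is hypothetical and was chosen so that one group has more than one row (with random floats, every group has exactly one row); only the groupby call itself comes from the report.

```python
import pandas as pd

# Hypothetical toy data (not from the original report): the first two rows
# share the same (A, B) key, so their C values land in one tuple.
df = pd.DataFrame({
    "A": ["x", "x", "y"],
    "B": [1, 1, 2],
    "C": [10.0, 20.0, 30.0],
})

# Same pattern as the workaround: wrap tuple in a lambda so that pandas
# receives an ordinary function.
grouped_df = df.groupby(["A", "B"], as_index=False).aggregate(lambda x: tuple(x))
print(grouped_df)
```

Each row of the result should correspond to one (A, B) pair, with column C holding a tuple of that group's values.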

This issue was filed as a result of the following discussion: https://github.com/PyCQA/pylint/issues/1709#issuecomment-341095096

Expected Output

The call should work without raising a ValueError.

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-97-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.20.3
pytest: None
pip: 8.1.2
setuptools: 28.2.0
Cython: 0.24.1
numpy: 1.13.3
scipy: 0.18.1
xarray: None
IPython: 5.4.1
sphinx: 1.4.8
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.7
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0b10
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: None
pandas_gbq: None
pandas_datareader: None

Issue Analytics

State:
Created 6 years ago
Comments: 7 (7 by maintainers)

Top GitHub Comments

bobhaffner commented, Nov 2, 2017

Hi @sinanonur, your example works in the newly released 0.21.0. Please upgrade when you can.

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(100, 3), columns=list('ABC'))
grouped_df = df.groupby(['A', 'B'], as_index=False).aggregate(tuple)
print(pd.__version__)
print(grouped_df.head())

0.21.0
          A         B                      C
0 -2.159796 -1.233800  (1.0704251018658486,)
1 -1.947438  2.082122  (0.5849118717358551,)
2 -1.738639  0.653051  (1.1259850203053805,)
3 -1.638240 -0.799216  (0.3626490086583796,)
4 -1.562435 -0.232689  (-1.120885109955278,)
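If upgrading is not immediately possible, one option is to try the bare `tuple` first and fall back to the lambda workaround when a ValueError is raised. The sketch below is only an illustration: `aggregate_as_tuples` is a hypothetical helper name and the sample frame is made up; neither comes from pandas or from the issue.

```python
import pandas as pd


def aggregate_as_tuples(df, keys):
    """Hypothetical helper: group df by keys and collect each column into tuples."""
    grouped = df.groupby(keys, as_index=False)
    try:
        # Reported to work from pandas 0.21.0 onward.
        return grouped.aggregate(tuple)
    except ValueError:
        # Fallback for versions that reject the bare tuple type
        # (the report above is against 0.20.3).
        return grouped.aggregate(lambda x: tuple(x))


# Hypothetical sample data with one two-row group.
sample = pd.DataFrame({
    "A": ["x", "x", "y"],
    "B": [1, 1, 2],
    "C": [10.0, 20.0, 30.0],
})
result = aggregate_as_tuples(sample, ["A", "B"])
print(result)
```

Catching only ValueError keeps the fallback narrow, so unrelated failures still surface.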

gfyoung commented, Nov 4, 2017