Additional calls to function when using GroupBy.apply with multiple index
Code Sample, a copy-pastable example if possible
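import pandas as pd

df = pd.DataFrame([[1, 2, 3], [2, 2, 3]], columns=['a', 'b', 'c'])

def t(df):
    # Print each sub-frame passed in, so every invocation is visible.
    print(df)
    print('-' * 10)
    return None

g = df.groupby(by=['a', 'b'])
g.apply(t)
# Only two groups exist, keyed by (1, 2) and (2, 2).
print(g.groups)
g.apply(lambda x: print(x.index))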
Problem description
The function is invoked by ‘apply’ three times while there are only two groups.
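A minimal sketch for counting the invocations instead of reading printed output. The helper name `record` and the list `calls` are illustrative and not part of the original report, and the function returns a scalar rather than None to keep the result simple. The number of calls depends on the installed pandas version, so no particular count is asserted here.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [2, 2, 3]], columns=['a', 'b', 'c'])
g = df.groupby(by=['a', 'b'])

calls = []  # one entry per invocation of the applied function

def record(group):
    # Remember which rows this invocation received.
    calls.append(tuple(group.index))
    return len(group)

g.apply(record)

n_groups = len(g.groups)
n_calls = len(calls)
# If n_calls exceeds n_groups, one group was passed to the function more than once.
print('groups:', n_groups, 'calls:', n_calls)
```

On the version reported here (pandas 0.21.0), the report suggests this would show three calls for two groups.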
Expected Output
The function is only invoked once by ‘apply’ for each group.
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-119-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_HK.UTF-8
LOCALE: en_HK.UTF-8

pandas: 0.21.0
pytest: 3.2.1
pip: 9.0.1
setuptools: 36.4.0
Cython: None
numpy: 1.14.3
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: None
tables: None
numexpr: 2.6.5
feather: None
matplotlib: None
openpyxl: 2.4.9
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: 1.1.15
pymysql: None
psycopg2: 2.7.3.1 (dt dec pq3 ext lo64)
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
Issue Analytics
- Created 5 years ago
- Comments: 6 (5 by maintainers)
That is already mentioned there, and for now #2936 is kept open as the issue to potentially improve this (add a keyword to specify the path instead of trying / calling the function twice, …)
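Until something like that is available, one way to avoid depending on how many times apply calls the function is to iterate over the groups explicitly. This is only a sketch of a workaround, not a change to pandas itself; the names `visited` and `results` are illustrative, and `t` here is a side-effecting stand-in for the function in the report.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [2, 2, 3]], columns=['a', 'b', 'c'])

visited = []  # records each group key the function was called for

def t(key, group):
    # Side effect we want to happen exactly once per group.
    visited.append(key)
    return len(group)

# Iterating a GroupBy yields one (key, sub-frame) pair per group,
# so t runs exactly once for each group.
results = {key: t(key, group) for key, group in df.groupby(by=['a', 'b'])}
print(results)
```

The trade-off is that the results are not combined into a Series or DataFrame automatically, as apply would do; that step has to be written by hand if it is needed.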
@jorisvandenbossche : Does a similar comment apply to #2936?