df.agg gives a TypeError when sending more than one argument with
Code Sample
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(0,30,size=(30, 4)), columns=list('ABCD'))
df = df.set_index(pd.date_range('2018-04-18 06:00:00', '2018-04-22 06:00:00', periods=30))
df['B'][5:15] = np.nan
df.agg('sum', axis=1) # this works
df.agg('sum', skipna=False) # this works
# but I want both
df.agg('sum', axis=1, skipna=False) # this breaks
# TypeError: ("'str' object is not callable", 'occurred at index 2018-04-18 06:00:00')
Problem description
I need to pass both arguments to sum, but giving both gives an error. It works separately.
I need to specify the aggfunc as a “variable”, since it won’t always be the same.
This is a working alternative, but feels hacky when a built-in method is there to do the job:
getattr(df, 'sum')(axis=1, skipna=False)
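The workaround above can be wrapped so that the reduction name stays a variable. The following is a minimal sketch, not part of pandas: `reduce_by_name` is a hypothetical helper name and the small frame is made-up illustration data. It also shows `operator.methodcaller` from the standard library, which performs the same lookup by name.

```python
import operator

import numpy as np
import pandas as pd

# Made-up illustration data: one NaN in column B, second row.
df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [4.0, np.nan, 6.0]})


def reduce_by_name(frame, func_name, **kwargs):
    """Hypothetical helper: look up a DataFrame method by name and call it."""
    method = getattr(frame, func_name)
    return method(**kwargs)


# The reduction name is an ordinary string, so it can come from a variable.
row_sums = reduce_by_name(df, "sum", axis=1, skipna=False)

# Standard-library equivalent of the getattr lookup.
same_result = operator.methodcaller("sum", axis=1, skipna=False)(df)
```

Both calls bypass `df.agg` entirely, so they do not depend on how `agg` forwards keyword arguments.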
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 142 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None
pandas: 0.23.3
pytest: 3.6.3
pip: 18.0
setuptools: 40.0.0
Cython: 0.28.4
numpy: 1.14.5
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: 1.7.5
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.5
feather: None
matplotlib: 2.2.2
openpyxl: 2.5.4
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.5
lxml: 4.2.3
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.10
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
Issue Analytics
- Created 5 years ago
- Comments: 5 (5 by maintainers)
Top GitHub Comments
Thanks for verifying @felixDulys! Yes, appears you’re right that this is already tested and that we can close this issue.
Happy for you to help on other good first issue + needs tests open issues!

I am new to contributing to the pandas codebase, so please excuse any noob mistakes 😃
Conclusion: It looks to me like this combination is already being tested here.
Thought process on this: In pandas/tests/frame/test_reductions.py::TestDataFrameAnalytics::test_stat_op_calc, we are able to test several reductions directly on a DataFrame with various NA-handling techniques when calling assert_stat_op_calc(). We can specify sum as the operation to test (opname), and the flag has_skipna (default True) ensures that we test this functionality when skipna=False.
I propose we close this issue since it appears to work in master (verified +1) and it already has test coverage. Happy to still work on this if I am misunderstanding how this works and coverage is still needed!
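For readers who want to confirm the behaviour locally, a check along the following lines may help. It is a rough sketch only: it is not the code in the pandas test suite, the function name `check_agg_sum_axis_and_skipna` and the sample frame are hypothetical, and it assumes a pandas version in which the reported TypeError no longer occurs.

```python
import numpy as np
import pandas as pd


def check_agg_sum_axis_and_skipna():
    """Hypothetical check, not the actual pandas test."""
    # Made-up data with a single NaN in the second row.
    df = pd.DataFrame(
        {"A": [1.0, 2.0, 3.0], "B": [4.0, np.nan, 6.0], "C": [7.0, 8.0, 9.0]}
    )
    # The call from the report: a string aggfunc plus both keyword arguments.
    result = df.agg("sum", axis=1, skipna=False)
    # Reference result from calling the reduction directly.
    expected = df.sum(axis=1, skipna=False)
    pd.testing.assert_series_equal(result, expected)
    # With skipna=False, the row containing NaN must sum to NaN.
    assert np.isnan(result.iloc[1])
    return result


row_sums = check_agg_sum_axis_and_skipna()
```

On pandas 0.23.3, the version in the report, the `df.agg` line would be expected to raise the TypeError shown in the code sample instead.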