df.agg gives a TypeError when sending more than one argument with

See original GitHub issue

Code Sample

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(0,30,size=(30, 4)), columns=list('ABCD'))
df = df.set_index(pd.date_range('2018-04-18 06:00:00', '2018-04-22 06:00:00', periods=30))
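df['B'] = df['B'].astype(float)  # NaN needs a float column
df.loc[df.index[5:15], 'B'] = np.nan  # set rows 5-14 without chained assignment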

df.agg('sum', axis=1)  # this works
df.agg('sum', skipna=False)  # this works
# but I want both
df.agg('sum', axis=1, skipna=False)  # this breaks
# TypeError: ("'str' object is not callable", 'occurred at index 2018-04-18 06:00:00')

Problem description

I need to pass both arguments to sum, but passing them together raises an error, even though each one works on its own. I need to specify the aggregation function as a “variable”, since it won’t always be the same.

This is a working alternative, but feels hacky when a built-in method is there to do the job:

getattr(df, 'sum')(axis=1, skipna=False)
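
The getattr workaround above can be wrapped in a small helper so that the reduction name stays a variable. This is only a sketch: `reduce_by_name` and `ALLOWED_REDUCTIONS` are hypothetical names introduced here for illustration, not part of pandas.

```python
import numpy as np
import pandas as pd

# Hypothetical allow-list (an assumption of this sketch); extend as needed.
ALLOWED_REDUCTIONS = {"sum", "mean", "min", "max", "prod"}


def reduce_by_name(df, name, **kwargs):
    """Call the DataFrame reduction called `name`, forwarding keyword arguments."""
    if name not in ALLOWED_REDUCTIONS:
        raise ValueError(f"unsupported reduction: {name!r}")
    # Look up the bound method (e.g. df.sum) and call it directly.
    return getattr(df, name)(**kwargs)


df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [10.0, np.nan, 30.0]})

# Both keyword arguments reach the underlying method together.
row_sums = reduce_by_name(df, "sum", axis=1, skipna=False)
row_max = reduce_by_name(df, "max", axis=1, skipna=False)
print(row_sums)
print(row_max)
```

Restricting the name to an allow-list keeps an arbitrary string from resolving to an unrelated DataFrame attribute.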

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 142 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.23.3
pytest: 3.6.3
pip: 18.0
setuptools: 40.0.0
Cython: 0.28.4
numpy: 1.14.5
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: 1.7.5
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.5
feather: None
matplotlib: 2.2.2
openpyxl: 2.5.4
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.5
lxml: 4.2.3
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.10
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction
mroeschke commented, Jun 27, 2021

Thanks for verifying @felixDulys! Yes, appears you’re right that this is already tested and that we can close this issue.

Happy for you to help on other good first issue + needs tests open issues!

0 reactions
felixDulys commented, Jun 27, 2021

I am new to contributing to the pandas codebase, so please excuse any noob mistakes 😃

Conclusion: It looks to me like this combination is already being tested.

Thought process on this: In pandas/tests/frame/test_reductions.py::TestDataFrameAnalytics::test_stat_op_calc, several reductions are tested directly on a DataFrame with various NA-handling options by calling assert_stat_op_calc(). We can specify sum as the operation to test (opname), and the has_skipna flag (default True) ensures that the skipna=False case is tested as well.

I propose we close this issue since it appears to work in master (verified +1) and it already has test coverage. Happy to still work on this if I am misunderstanding how this works and coverage is still needed!
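
For readers who want to confirm this locally, a minimal check might look like the sketch below. It assumes an installed pandas release that already includes the behaviour verified on master; the small frame is made up for illustration and is not taken from the pandas test suite.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [10.0, np.nan, 30.0]})

# Name of the reduction held in a variable, as in the original report.
func_name = "sum"

# Path reported as broken on pandas 0.23.3: both keyword arguments via agg.
via_agg = df.agg(func_name, axis=1, skipna=False)

# Reference result: call the reduction method directly.
direct = getattr(df, func_name)(axis=1, skipna=False)

# Compare the values, treating NaN in the same position as equal.
same = np.array_equal(via_agg.to_numpy(), direct.to_numpy(), equal_nan=True)
print(same)
```

On a release that still has the bug, the `df.agg` call would raise the TypeError from the report instead of reaching the comparison.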
