Problem with DataFrame.diff() when using groupby getting "unexpected keyword argument 'axis'" due to built-in wrapper

Code Sample, a copy-pastable example if possible

import pandas as pd

data = pd.read_stata("myfile.dta")
data = data.set_index(['country', 'year'])
# First differences within each country
data_delta = data.groupby('country').diff()

Problem description

Hi everyone! My first bug report 😃

I’m having some problems with the .diff() method, and at first I thought I was just being an idiot, but now I’m fairly confident I’ve isolated the bug.

Note, when I run this manually line-by-line it works fine, but I depend on this being inside a function (because I remove some columns before doing the differences and then reinstate them in a highly repetitive fashion).

For a long time I was on pandas 0.18.x and was using the following command fine:

data = data.groupby('country').diff().shift(-1)

But after upgrading to pandas 0.20.1, the behavior of diff seems to have changed, and it now takes a periods argument, which is very useful to me! The problem is that I now get an error every time I use it. The traceback looks like this:

Traceback (most recent call last):
  File "/Users/myname/anaconda/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-7-5e1d634b8803>", line 1, in <module>
    dat = feature_expand(data_everything, lags=2, lag_y=True, delta=True)
  File "<ipython-input-3-184a59b406db>", line 126, in feature_expand
    data_delta = data_delta.diff()
  File "<string>", line 21, in diff
  File "/Users/myname/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 612, in wrapper
    *args, **kwargs)
  File "/Users/myname/anaconda/lib/python2.7/site-packages/pandas/core/groupby.py", line 3481, in _aggregate_item_by_item
    raise errors
TypeError: diff() got an unexpected keyword argument 'axis'

Following the traceback, I find a wrapper function in groupby.py, _GroupBy._make_wrapper().wrapper, which says it does some “trickery for aggregation functions that need an axis” and seems to add the axis keyword argument by itself. This has probably been useful behaviour previously, but now it breaks .diff(), as it doesn’t take an axis argument anymore.
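To illustrate the failure mode described above, here is a minimal plain-Python sketch. It is not the pandas source: make_wrapper, diff_with_axis and diff_without_axis are hypothetical names invented for this example, and the only assumption is that a wrapper adds axis to the keyword arguments before calling the target function.

```python
def make_wrapper(func):
    """Hypothetical stand-in for a wrapper that injects `axis` by itself."""
    def wrapper(*args, **kwargs):
        # Add axis=0 on the caller's behalf, whether or not func accepts it.
        kwargs_with_axis = dict(kwargs, axis=0)
        return func(*args, **kwargs_with_axis)
    return wrapper


def diff_without_axis(values, periods=1):
    """Toy first difference on a list; the signature has no `axis` parameter."""
    head = [None] * periods
    tail = [values[i] - values[i - periods] for i in range(periods, len(values))]
    return head + tail


def diff_with_axis(values, periods=1, axis=0):
    """Same computation, but the signature tolerates the injected `axis`."""
    return diff_without_axis(values, periods=periods)


# Works: the target accepts the injected keyword.
ok_result = make_wrapper(diff_with_axis)([1, 4, 9])

# Fails: the target has no `axis` parameter, so Python raises TypeError.
error_message = None
try:
    make_wrapper(diff_without_axis)([1, 4, 9])
except TypeError as exc:
    error_message = str(exc)

print(ok_result)
print(error_message)
```

The second call fails with the same kind of TypeError as in the traceback, which is consistent with the wrapper passing axis to a function whose signature does not include it.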

I hope someone has time to help me and the community with this.

Cheers 😃

Expected Output

A dataframe of country-level first differences.
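As a rough sketch of what that looks like, here is a tiny made-up panel. The gdp column and its values are invented for illustration, and the example assumes a pandas version in which grouped .diff() runs without the error above.

```python
import pandas as pd

# Made-up two-country panel indexed by (country, year).
data = pd.DataFrame(
    {
        "country": ["A", "A", "A", "B", "B", "B"],
        "year": [2000, 2001, 2002, 2000, 2001, 2002],
        "gdp": [1.0, 3.0, 6.0, 10.0, 15.0, 21.0],
    }
).set_index(["country", "year"])

# First differences computed separately within each country.
data_delta = data.groupby(level="country").diff()

print(data_delta)
```

The first year of each country has no previous observation to subtract, so it comes out as NaN; every other row holds the change from the previous year within the same country.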

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.13.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.20.1
pytest: 2.9.2
pip: 9.0.1
setuptools: 35.0.2
Cython: 0.24.1
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: 1.1.0
tables: 3.2.3.1
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.7.3
bs4: 4.5.3
html5lib: 0.9999999
sqlalchemy: 1.1.9
pymysql: 0.7.9.None
psycopg2: 2.7.1 (dt dec pq3 ext lo64)
jinja2: 2.8
s3fs: None
pandas_gbq: None
pandas_datareader: None

Issue Analytics

  • State: closed
  • Created: 6 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

pratapvardhan commented, Aug 26, 2017 (1 reaction)

I suspect it’s related to #14773.

jreback commented, Aug 28, 2017 (0 reactions)

Closing as duplicate.
