pivot_table throws exception with multiple aggregations per column and margins=True


When applying multiple aggregations to a column and setting margins=True, I receive a KeyError. I believe that because multiple aggregations are applied, the columns become a MultiIndex, which the margin computation does not expect.

>>> import pandas as pd
>>> import numpy as np
>>> import random


>>> df = pd.DataFrame({'random1': [random.random() for i in range(10)],
                       'random2': [random.random() for i in range(10)],
                       'type': ['duck', 'bird']*5},
                index=range(10,20))

>>> df.pivot_table(index='type', 
                   aggfunc={'random1': [np.median, np.mean], 
                            'random2': np.sum}, 
                   margins=True)

KeyError                                  Traceback (most recent call last)
<ipython-input> in <module>()
      3                aggfunc={'random1': [np.median, np.mean], 
      4                         'random2': np.sum}, 
----> 5                margins=True)

/pandas/util/decorators.pyc in wrapper(*args, **kwargs)
     86                 else:
     87                     kwargs[new_arg_name] = new_arg_value
---> 88             return func(*args, **kwargs)
     89         return wrapper
     90     return _deprecate_kwarg

/pandas/util/decorators.pyc in wrapper(*args, **kwargs)
     86                 else:
     87                     kwargs[new_arg_name] = new_arg_value
---> 88             return func(*args, **kwargs)
     89         return wrapper
     90     return _deprecate_kwarg

/pandas/tools/pivot.pyc in pivot_table(data, values, index, columns, aggfunc, fill_value, margins, dropna)
    145     if margins:
    146         table = _add_margins(table, data, values, rows=index,
--> 147                              cols=columns, aggfunc=aggfunc)
    148 
    149     # discard the top level

/pandas/tools/pivot.pyc in _add_margins(table, data, values, rows, cols, aggfunc)
    189             row_margin[k] = grand_margin[k]
    190         else:
--> 191             row_margin[k] = grand_margin[k[0]]
    192 
    193     margin_dummy = DataFrame(row_margin, columns=[key]).T

KeyError: 'random1'

The exact same error occurs with

aggfunc={'random1': {'median': np.median, 'mean': np.mean}, 
         'random2': np.sum}

These errors do not occur when margins=False. This is using pandas version 0.15.2. I have not seen any changes that would lead me to believe it has been fixed in master.

Happy to create a PR, although I haven’t done one before.
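
Until margins handles this case, one possible stopgap is to build the pivot table with margins=False and append the "All" row by hand. The following is a minimal sketch, not the pandas fix: it assumes a recent pandas and NumPy, uses string function names in place of the np.* callables, and the names `all_row` and `with_margin` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic data shaped like the report above; the seed is arbitrary.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "random1": rng.random(10),
        "random2": rng.random(10),
        "type": ["duck", "bird"] * 5,
    },
    index=range(10, 20),
)

# Without margins the call succeeds, and the columns form a MultiIndex
# of (column, function) pairs.
table = df.pivot_table(
    index="type",
    aggfunc={"random1": ["median", "mean"], "random2": "sum"},
)
print(type(table.columns).__name__)

# Build the "All" row by applying each function to the whole column.
# Tuple keys give the Series the same MultiIndex as the table columns.
all_row = pd.Series(
    {(col, func): df[col].agg(func) for col, func in table.columns},
    name="All",
)

# Append the hand-built margin as the last row.
with_margin = pd.concat([table, all_row.to_frame().T])
print(with_margin)
```

Each margin value here is the aggregation recomputed over the entire column, not an aggregate of the per-group results, so the "All" median is the median of all ten values.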

Issue Analytics

  • State: open
  • Created 8 years ago
  • Reactions: 4
  • Comments: 14 (3 by maintainers)

Top GitHub Comments

stefansimik commented, May 5, 2019 (2 reactions)

Hi @tmo, if you could look into this, it would be really great. I would like to do it myself, but fixing this in the pandas codebase is above my experience level.

lukaszkiszka commented, Feb 12, 2019 (2 reactions)

@tmo, go ahead. I was planning to look into this in a few weeks. If you find a fix, I would be glad to be added as a reviewer.
