BUG: DataFrame.agg - why numpy.size doesn't work?
- [x] I have checked that this issue has not already been reported.
- [x] I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]],
                  columns=['A', 'B', 'C'])

# String aliases work
df.agg({'A': ['mean', 'std', 'size']})

# Somehow this doesn't work with DataFrame.agg, but it works with DataFrameGroupBy.agg
df.agg({'A': [np.mean, np.std, np.size]})
Problem description
Intuitively, I assumed df.agg({'A': [np.mean, np.std, np.size]}) would behave the same way as df.agg({'A': ['mean', 'std', 'size']}), but it doesn't, and I don't understand why. I looked through the docs below and still couldn't work it out:
- https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.agg.html
- https://pandas.pydata.org/pandas-docs/version/0.22/generated/pandas.core.groupby.DataFrameGroupBy.agg.html
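For comparison, here is a minimal sketch that is not part of the original report. It runs the string-alias form that the report says works and checks it against the same three statistics computed directly on the column. It assumes a reasonably recent pandas; the names `by_name` and `direct` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=["A", "B", "C"])

# String aliases: the form the report says works.
by_name = df.agg({"A": ["mean", "std", "size"]})

# The same three statistics computed directly on the column, without agg.
col = df["A"]
direct = pd.Series(
    {"mean": col.mean(), "std": col.std(), "size": col.size},
    name="A",
)

print(by_name)
print(direct)
```

One thing to keep in mind when swapping between the two spellings: `Series.std` defaults to `ddof=1`, while `np.std` on a plain array defaults to `ddof=0`, so the string alias and the NumPy function are not interchangeable in general.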
Expected Output
A
4.0 3.0 4.0

Output of df.agg({'A':[np.mean,np.std,np.size]})
TypeError                                 Traceback (most recent call last)
~\anaconda3\lib\site-packages\pandas\core\base.py in _aggregate_multiple_funcs(self, arg, _axis)
    553         try:
--> 554             return concat(results, keys=keys, axis=1, sort=False)
    555         except TypeError:

~\anaconda3\lib\site-packages\pandas\core\reshape\concat.py in concat(objs, axis, join, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    280         copy=copy,
--> 281         sort=sort,
    282     )

~\anaconda3\lib\site-packages\pandas\core\reshape\concat.py in __init__(self, objs, axis, join, keys, levels, names, ignore_index, verify_integrity, copy, sort)
    356                 )
--> 357                 raise TypeError(msg)
    358

TypeError: cannot concatenate object of type '<class 'float'>'; only Series and DataFrame objs are valid

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-39-051b5cf01f85> in <module>
      1 import numpy as np
----> 2 df.agg({'A':[np.mean,np.std,np.size]})

~\anaconda3\lib\site-packages\pandas\core\frame.py in aggregate(self, func, axis, *args, **kwargs)
   6704         result = None
   6705         try:
-> 6706             result, how = self._aggregate(func, axis=axis, *args, **kwargs)
   6707         except TypeError:
   6708             pass

~\anaconda3\lib\site-packages\pandas\core\frame.py in _aggregate(self, arg, axis, *args, **kwargs)
   6718             result = result.T if result is not None else result
   6719             return result, how
-> 6720         return super()._aggregate(arg, *args, **kwargs)
   6721
   6722     agg = aggregate

~\anaconda3\lib\site-packages\pandas\core\base.py in _aggregate(self, arg, *args, **kwargs)
    426
    427             try:
--> 428                 result = _agg(arg, _agg_1dim)
    429             except SpecificationError:
    430

~\anaconda3\lib\site-packages\pandas\core\base.py in _agg(arg, func)
    393                 result = {}
    394                 for fname, agg_how in arg.items():
--> 395                     result[fname] = func(fname, agg_how)
    396                 return result
    397

~\anaconda3\lib\site-packages\pandas\core\base.py in _agg_1dim(name, how, subset)
    377                         "nested dictionary is ambiguous in aggregation"
    378                     )
--> 379                 return colg.aggregate(how)
    380
    381             def _agg_2dim(name, how):

~\anaconda3\lib\site-packages\pandas\core\series.py in aggregate(self, func, axis, *args, **kwargs)
   3686         # Validate the axis parameter
   3687         self._get_axis_number(axis)
-> 3688         result, how = self._aggregate(func, *args, **kwargs)
   3689         if result is None:
   3690

~\anaconda3\lib\site-packages\pandas\core\base.py in _aggregate(self, arg, *args, **kwargs)
    484         elif is_list_like(arg):
    485             # we require a list, but not an 'str'
--> 486             return self._aggregate_multiple_funcs(arg, _axis=_axis), None
    487         else:
    488             result = None

~\anaconda3\lib\site-packages\pandas\core\base.py in _aggregate_multiple_funcs(self, arg, _axis)
    562             result = Series(results, index=keys, name=self.name)
    563             if is_nested_object(result):
--> 564                 raise ValueError("cannot combine transform and aggregation operations")
    565             return result
    566

ValueError: cannot combine transform and aggregation operations
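The final error is raised when one of the collected results is itself a Series rather than a scalar. One plausible reading (an assumption on my part, not something confirmed in this thread) is that np.size ended up being applied element by element instead of to the whole column. The sketch below does not reproduce pandas' internal dispatch; it only shows, with `Series.map` as a stand-in for elementwise application, how differently np.size behaves in the two cases. The names `whole` and `elementwise` are illustrative.

```python
import numpy as np
import pandas as pd

col = pd.Series([1, 4, 7], name="A")

# Called on the whole column, np.size reduces it to a single number.
whole = np.size(col)

# Called on each scalar in turn, np.size returns 1 every time, so the result
# has the same length as the input: a transform, not a reduction.
elementwise = col.map(np.size)

print(whole)
print(elementwise)
```

If that is what happens inside agg, the list would mix two scalars (from the mean and std) with one Series (from np.size), which would match the "cannot combine transform and aggregation operations" message.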
Issue Analytics
- Created 2 years ago
- Comments: 7 (5 by maintainers)
Actually this works:
The only thing relevant to your issue is:
Wow, this looks serious. I have another example.
So df.agg({'A':}) is more like df.A.agg()?
It gets weirder