REGR: DataFrame.mean(numeric_only=True) raises AttributeError on v1.0.3

Code Sample, a copy-pastable example if possible

>>> import numpy as np
>>>
>>> import pandas as pd
>>>
>>> pd.__version__
'1.0.3'
>>>
>>> df_wide = pd.DataFrame(np.random.randint(1000, size=(1000, 100))).astype("Int64").copy()
>>>
>>> df_wide.mean(numeric_only=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\simon\pandas\pandas\core\generic.py", line 11215, in stat_func
    f, name, axis=axis, skipna=skipna, numeric_only=numeric_only
  File "C:\Users\simon\pandas\pandas\core\frame.py", line 7896, in _reduce
    res = df._data.reduce(op, axis=1, skipna=skipna, **kwds)
  File "C:\Users\simon\pandas\pandas\core\internals\managers.py", line 351, in reduce
    bres = func(blk.values, *args, **kwargs)
  File "C:\Users\simon\pandas\pandas\core\nanops.py", line 69, in _f
    return f(*args, **kwargs)
  File "C:\Users\simon\pandas\pandas\core\nanops.py", line 102, in f
    if values.size == 0 and kwds.get("min_count") is None:
AttributeError: 'IntegerArray' object has no attribute 'size'
>>>

Problem description

This is a regression from 0.25.3, where the same call returns the column means (see Expected Output).

0aa48f7d9269206dea492ed14d5dfc6f46468de3 is the first bad commit
commit 0aa48f7d9269206dea492ed14d5dfc6f46468de3
Author: jbrockmendel <jbrockmendel@gmail.com>
Date:   Wed Jan 1 09:18:20 2020 -0800

    PERF: perform reductions block-wise (#29847)

On master, the same call fails with a different error, AttributeError: 'int' object has no attribute 'dtype':

>>> df_wide.mean(numeric_only=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\simon\pandas\pandas\core\generic.py", line 11114, in stat_func
    func, name=name, axis=axis, skipna=skipna, numeric_only=numeric_only
  File "C:\Users\simon\pandas\pandas\core\frame.py", line 7990, in _reduce
    res = df._data.reduce(blk_func)
  File "C:\Users\simon\pandas\pandas\core\internals\managers.py", line 362, in reduce
    bres = func(blk.values, *args, **kwargs)
  File "C:\Users\simon\pandas\pandas\core\frame.py", line 7985, in blk_func
    return op(values, axis=0, skipna=skipna, **kwds)
  File "C:\Users\simon\pandas\pandas\core\nanops.py", line 120, in f
    result = bn_func(values, axis=axis, **kwds)
  File "<__array_function__ internals>", line 6, in nanmean
  File "C:\Users\simon\Anaconda3\envs\pandas-dev\lib\site-packages\numpy\lib\nanfunctions.py", line 952, in nanmean
    avg = _divide_by_count(tot, cnt, out=out)
  File "C:\Users\simon\Anaconda3\envs\pandas-dev\lib\site-packages\numpy\lib\nanfunctions.py", line 219, in _divide_by_count
    return a.dtype.type(a / b)
AttributeError: 'int' object has no attribute 'dtype'
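
Both tracebacks above fail inside the frame-level, block-wise reduction. A possible workaround, which is an assumption and not something proposed in the issue, is to reduce one column at a time so that each mean goes through Series.mean instead. The helper name column_means is hypothetical, and this sketch has not been verified on 1.0.3 itself:

```python
import numpy as np
import pandas as pd


def column_means(df: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: compute each column's mean via Series.mean.

    This avoids calling DataFrame.mean, which is the call that raises
    in the report above.
    """
    return pd.Series({col: df[col].mean() for col in df.columns})


# Same shape of data as in the report, with a seeded generator so the
# numbers are reproducible.
raw = np.random.default_rng(0).integers(1000, size=(1000, 100))
df_wide = pd.DataFrame(raw).astype("Int64")

means = column_means(df_wide)
print(means.head())
```

This is slower than a block-wise reduction on wide frames, so it only makes sense as a stopgap on an affected version.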

Expected Output

>>> import numpy as np
>>>
>>> import pandas as pd
>>>
>>> pd.__version__
'0.25.3'
>>>
>>> df_wide = pd.DataFrame(np.random.randint(1000, size=(1000, 100))).astype("Int64").copy()
>>>
>>> df_wide.mean(numeric_only=True)
0     520.057
1     507.735
2     501.618
3     506.590
4     501.500
       ...
95    507.594
96    483.273
97    506.330
98    497.068
99    508.118
Length: 100, dtype: float64
>>>
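
To check whether an installed pandas version is affected, the expected behaviour above can be turned into a small script. This is a minimal sketch, not part of the original report: it assumes that the mean of the nullable Int64 frame should match the mean of the same data held as plain int64, and it reports an AttributeError as the regression described here.

```python
import numpy as np
import pandas as pd

# Seeded data so the check is reproducible.
raw = np.random.default_rng(0).integers(1000, size=(1000, 100))

# Reference result from ordinary int64 columns.
expected = pd.DataFrame(raw).mean()

try:
    result = pd.DataFrame(raw).astype("Int64").mean(numeric_only=True)
    same = np.allclose(result.to_numpy(dtype="float64"), expected.to_numpy())
    status = "ok" if same else "mismatch"
except AttributeError as exc:
    # The failure mode shown in the tracebacks of this report.
    status = f"regression: {exc}"

print(pd.__version__, status)
```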

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

jbrockmendel commented, Apr 3, 2020 (1 reaction)

@simonjayhawkins can you @ me on these for issues that I caused?

simonjayhawkins commented, Apr 24, 2020 (0 reactions)

Changing the milestone; xref #33300 to track.
