Error for Int64Dtype column: "The 'reduce' method is not supported."

Thank you for creating this very useful library!

Describe the bug

I get an error when using Pandas Profiling with a data frame that contains an Int64Dtype() column with at least 5 rows.

To Reproduce

Create a file example.py with this code:

"""
Test for issue 502:
https://github.com/pandas-profiling/pandas-profiling/issues/502
"""
import numpy as np
import pandas as pd
from pandas_profiling import ProfileReport

df = pd.DataFrame({
    'a': [1, 2, 3, 4, 5]
}, dtype=pd.Int64Dtype())

profile = ProfileReport(df, title='Pandas Profiling Report', explorative=True)
profile.to_file("your_report.html")

Run from the command line with python example.py. Output:

Summarize dataset:   0%|                                 | 0/15 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "example.py", line 10, in <module>
    profile.to_file("your_report.html")
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/profile_report.py", line 245, in to_file
    data = self.to_html()
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/profile_report.py", line 348, in to_html
    return self.html
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/profile_report.py", line 168, in html
    self._html = self._render_html()
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/profile_report.py", line 275, in _render_html
    report = self.report
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/profile_report.py", line 162, in report
    self._report = get_report_structure(self.description_set)
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/profile_report.py", line 143, in description_set
    self._description_set = describe_df(self.title, self.df)
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/model/describe.py", line 63, in describe
    series_description = get_series_descriptions(df, pbar)
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/model/summary.py", line 473, in get_series_descriptions
    executor.imap_unordered(multiprocess_1d, args)
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 735, in next
    raise value
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/model/summary.py", line 450, in multiprocess_1d
    return column, describe_1d(series)
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/model/summary.py", line 419, in describe_1d
    type_to_func[series_description["type"]](series, series_description)
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/model/summary.py", line 151, in describe_numeric_1d
    "min": np.min(present_values),
  File "<__array_function__ internals>", line 6, in amin
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 2831, in amin
    keepdims=keepdims, initial=initial, where=where)
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 87, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas/core/arrays/integer.py", line 372, in __array_ufunc__
    raise NotImplementedError("The 'reduce' method is not supported.")
NotImplementedError: The 'reduce' method is not supported.
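The last frames of the traceback show NumPy handing the reduction to the array's own __array_ufunc__ hook, which then refuses it. The sketch below imitates that mechanism with a hypothetical stand-in class, NoReduceArray, so it does not depend on any particular pandas version. It is not the pandas source, only an illustration of how such a hook can reject method="reduce".

```python
import numpy as np


class NoReduceArray:
    """Hypothetical stand-in (not pandas code) for an array type that rejects ufunc reductions."""

    def __init__(self, values):
        self._data = np.asarray(values)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # NumPy passes method="reduce" for calls such as np.minimum.reduce(obj).
        if method == "reduce":
            raise NotImplementedError("The 'reduce' method is not supported.")
        # Decline every other ufunc call in this minimal sketch.
        return NotImplemented


def reduce_error_message(obj):
    """Return the message raised by np.minimum.reduce(obj), or None if it succeeds."""
    try:
        np.minimum.reduce(obj)
    except NotImplementedError as exc:
        return str(exc)
    return None


message = reduce_error_message(NoReduceArray([1, 2, 3, 4, 5]))
print(message)
```

A plain NumPy array passed to the same helper reduces normally, so the failure comes from the array type's hook rather than from the values.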

Version information:

Python version: Python 3.6.9 (default, Apr 18 2020, 01:56:04)
Environment: Command line
pip: output of pip freeze:

astropy==4.0.1.post1
attrs==19.3.0
backcall==0.2.0
bleach==3.1.5
certifi==2020.6.20
chardet==3.0.4
confuse==1.1.0
cycler==0.10.0
decorator==4.4.2
defusedxml==0.6.0
entrypoints==0.3
htmlmin==0.1.12
idna==2.9
ImageHash==4.1.0
importlib-metadata==1.6.1
ipykernel==5.3.0
ipython==7.15.0
ipython-genutils==0.2.0
ipywidgets==7.5.1
jedi==0.17.1
Jinja2==2.11.2
joblib==0.15.1
jsonschema==3.2.0
jupyter-client==6.1.3
jupyter-core==4.6.3
kiwisolver==1.2.0
llvmlite==0.33.0
MarkupSafe==1.1.1
matplotlib==3.2.2
missingno==0.4.2
mistune==0.8.4
nbconvert==5.6.1
nbformat==5.0.7
networkx==2.4
notebook==6.0.3
numba==0.50.0
numpy==1.19.0
packaging==20.4
pandas==1.0.5
pandas-profiling==2.8.0
pandocfilters==1.4.2
parso==0.7.0
pexpect==4.8.0
phik==0.10.0
pickleshare==0.7.5
Pillow==7.1.2
pkg-resources==0.0.0
prometheus-client==0.8.0
prompt-toolkit==3.0.5
ptyprocess==0.6.0
Pygments==2.6.1
pyparsing==2.4.7
pyrsistent==0.16.0
python-dateutil==2.8.1
pytz==2020.1
PyWavelets==1.1.1
PyYAML==5.3.1
pyzmq==19.0.1
requests==2.24.0
scipy==1.5.0
seaborn==0.10.1
Send2Trash==1.5.0
six==1.15.0
tangled-up-in-unicode==0.0.6
terminado==0.8.3
testpath==0.4.4
tornado==6.0.4
tqdm==4.46.1
traitlets==4.3.3
urllib3==1.25.9
visions==0.4.4
wcwidth==0.2.5
webencodings==0.5.1
widgetsnbextension==3.5.1
zipp==3.1.0

Additional context

Having at least 5 rows seems to be required for the error to occur. This code will run WITHOUT ERROR:

import numpy as np
import pandas as pd
from pandas_profiling import ProfileReport

df = pd.DataFrame({
    'a': [1, 2, 3, 4] # no 5
}, dtype=pd.Int64Dtype())

profile = ProfileReport(df, title='Pandas Profiling Report', explorative=True)
profile.to_file("your_report.html")

Thank you!
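For anyone who cannot upgrade yet, one possible stopgap is to cast nullable integer columns to float64 before building the report. This is an assumption on the editor's part and was not suggested in the thread. The helper name cast_nullable_ints is hypothetical, and the cast turns missing values into NaN and changes how the column is typed in the report.

```python
import pandas as pd
from pandas.api.types import is_extension_array_dtype, is_integer_dtype


def cast_nullable_ints(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with nullable integer columns cast to float64.

    Hypothetical helper: missing values become NaN and other columns are untouched.
    """
    out = df.copy()
    for col in out.columns:
        dtype = out[col].dtype
        # Nullable integers (e.g. Int64) are extension dtypes; plain int64 is not.
        if is_extension_array_dtype(dtype) and is_integer_dtype(dtype):
            out[col] = out[col].astype("float64")
    return out


frame = pd.DataFrame({
    "a": pd.array([1, 2, None], dtype="Int64"),  # nullable integer column
    "b": [10, 20, 30],                           # ordinary integer column
})
converted = cast_nullable_ints(frame)
print(converted.dtypes)
```

The converted frame could then be passed to ProfileReport in place of the original one.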

Top GitHub Comments

andycraig commented, Jul 16, 2020

I installed it via pip using that command and ran the code from the bug report. It ran without error and generated the HTML output as expected. I believe the issue is now solved.

Thank you very much!

sbrugman commented, Jun 27, 2020

Seems that this error was introduced by the changes for the enhanced performance of summarization of numeric series. I’ve pushed a workaround to revert this for pandas’ nullable integers. Will be in the next release.
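The workaround itself is not shown in this thread. As a rough illustration of the general idea only, the sketch below computes the summary values through pandas' own Series reductions instead of calling NumPy reductions on the underlying array. The function name numeric_summary is hypothetical, and this is not the pandas-profiling implementation.

```python
import pandas as pd


def numeric_summary(series: pd.Series) -> dict:
    """Summarize a numeric Series using pandas reductions (hypothetical helper)."""
    present = series.dropna()  # ignore missing values, as a profiler typically would
    return {
        "min": present.min(),
        "max": present.max(),
        "mean": present.mean(),
    }


# Same data as the failing example in the issue: five rows of nullable integers.
s = pd.Series([1, 2, 3, 4, 5], dtype=pd.Int64Dtype())
stats = numeric_summary(s)
print(stats)
```

Series.min(), Series.max() and Series.mean() are implemented by pandas for nullable integer columns, so they avoid the ufunc.reduce path seen in the traceback.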
