Error for Int64Dtype column: "The 'reduce' method is not supported."


Thank you for creating this very useful library!

Describe the bug

I get an error when using Pandas Profiling with a data frame containing an Int64Dtype() column with at least 5 rows.

To Reproduce

Create a file example.py with this code:

"""
Test for issue 502:
https://github.com/pandas-profiling/pandas-profiling/issues/502
"""
import numpy as np
import pandas as pd
from pandas_profiling import ProfileReport

df = pd.DataFrame({
    'a': [1, 2, 3, 4, 5]
}, dtype=pd.Int64Dtype())

profile = ProfileReport(df, title='Pandas Profiling Report', explorative=True)
profile.to_file("your_report.html")

Run from the command line with python example.py. Output:

Summarize dataset:   0%|                                 | 0/15 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "example.py", line 10, in <module>
    profile.to_file("your_report.html")
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/profile_report.py", line 245, in to_file
    data = self.to_html()
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/profile_report.py", line 348, in to_html
    return self.html
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/profile_report.py", line 168, in html
    self._html = self._render_html()
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/profile_report.py", line 275, in _render_html
    report = self.report
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/profile_report.py", line 162, in report
    self._report = get_report_structure(self.description_set)
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/profile_report.py", line 143, in description_set
    self._description_set = describe_df(self.title, self.df)
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/model/describe.py", line 63, in describe
    series_description = get_series_descriptions(df, pbar)
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/model/summary.py", line 473, in get_series_descriptions
    executor.imap_unordered(multiprocess_1d, args)
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 735, in next
    raise value
  File "/usr/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/model/summary.py", line 450, in multiprocess_1d
    return column, describe_1d(series)
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/model/summary.py", line 419, in describe_1d
    type_to_func[series_description["type"]](series, series_description)
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas_profiling/model/summary.py", line 151, in describe_numeric_1d
    "min": np.min(present_values),
  File "<__array_function__ internals>", line 6, in amin
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 2831, in amin
    keepdims=keepdims, initial=initial, where=where)
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 87, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
  File "/home/ac/projects/python/profiling/env/lib/python3.6/site-packages/pandas/core/arrays/integer.py", line 372, in __array_ufunc__
    raise NotImplementedError("The 'reduce' method is not supported.")
NotImplementedError: The 'reduce' method is not supported.

Version information:

  • Python version: Python 3.6.9 (default, Apr 18 2020, 01:56:04)
  • Environment: command line
  • pip: output of pip freeze:

astropy==4.0.1.post1
attrs==19.3.0
backcall==0.2.0
bleach==3.1.5
certifi==2020.6.20
chardet==3.0.4
confuse==1.1.0
cycler==0.10.0
decorator==4.4.2
defusedxml==0.6.0
entrypoints==0.3
htmlmin==0.1.12
idna==2.9
ImageHash==4.1.0
importlib-metadata==1.6.1
ipykernel==5.3.0
ipython==7.15.0
ipython-genutils==0.2.0
ipywidgets==7.5.1
jedi==0.17.1
Jinja2==2.11.2
joblib==0.15.1
jsonschema==3.2.0
jupyter-client==6.1.3
jupyter-core==4.6.3
kiwisolver==1.2.0
llvmlite==0.33.0
MarkupSafe==1.1.1
matplotlib==3.2.2
missingno==0.4.2
mistune==0.8.4
nbconvert==5.6.1
nbformat==5.0.7
networkx==2.4
notebook==6.0.3
numba==0.50.0
numpy==1.19.0
packaging==20.4
pandas==1.0.5
pandas-profiling==2.8.0
pandocfilters==1.4.2
parso==0.7.0
pexpect==4.8.0
phik==0.10.0
pickleshare==0.7.5
Pillow==7.1.2
pkg-resources==0.0.0
prometheus-client==0.8.0
prompt-toolkit==3.0.5
ptyprocess==0.6.0
Pygments==2.6.1
pyparsing==2.4.7
pyrsistent==0.16.0
python-dateutil==2.8.1
pytz==2020.1
PyWavelets==1.1.1
PyYAML==5.3.1
pyzmq==19.0.1
requests==2.24.0
scipy==1.5.0
seaborn==0.10.1
Send2Trash==1.5.0
six==1.15.0
tangled-up-in-unicode==0.0.6
terminado==0.8.3
testpath==0.4.4
tornado==6.0.4
tqdm==4.46.1
traitlets==4.3.3
urllib3==1.25.9
visions==0.4.4
wcwidth==0.2.5
webencodings==0.5.1
widgetsnbextension==3.5.1
zipp==3.1.0

Additional context

Having at least 5 rows seems to be required for the error to occur. This code will run WITHOUT ERROR:

import numpy as np
import pandas as pd
from pandas_profiling import ProfileReport

df = pd.DataFrame({
    'a': [1, 2, 3, 4] # no 5
}, dtype=pd.Int64Dtype())

profile = ProfileReport(df, title='Pandas Profiling Report', explorative=True)
profile.to_file("your_report.html")

Thank you!
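
Until a fixed release is installed, one possible user-side workaround is to cast nullable-integer columns to float64 before building the report. The sketch below is an assumption, not something from the original report: cast_nullable_ints is a hypothetical helper, and the cast loses precision for integers above 2**53.

```python
import pandas as pd


def cast_nullable_ints(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: copy df, casting nullable-integer columns to float64."""
    out = df.copy()
    for name in out.columns:
        dtype = out[name].dtype
        is_nullable_int = isinstance(
            dtype, pd.api.extensions.ExtensionDtype
        ) and pd.api.types.is_integer_dtype(dtype)
        if is_nullable_int:
            # pd.NA becomes NaN in the resulting float64 column.
            out[name] = out[name].astype("float64")
    return out


df = pd.DataFrame({"a": [1, 2, 3, 4, 5]}, dtype=pd.Int64Dtype())
clean = cast_nullable_ints(df)
print(clean.dtypes)
# `clean`, rather than `df`, would then be passed to ProfileReport.
```

The original frame is left untouched, so the nullable dtype is still available for the rest of the pipeline.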

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Reactions: 5
  • Comments: 5

Top GitHub Comments

andycraig commented, Jul 16, 2020 (1 reaction)

I installed it via pip using that command and tried running the code from the bug report. It ran without error and generated the HTML output as expected. I believe the issue is solved now.

Thank you very much!

sbrugman commented, Jun 27, 2020 (1 reaction)

Seems that this error was introduced by the changes for the enhanced performance of summarization of numeric series. I’ve pushed a workaround to revert this for pandas’ nullable integers. Will be in the next release.
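
The traceback shows np.min being dispatched to pandas' nullable integer array, which rejected the ufunc "reduce" call in that pandas version. As a rough illustration only (this is not the actual patch, and numeric_min_max is a hypothetical name), a summary function could sidestep that by dropping missing values and converting to a plain NumPy array before reducing:

```python
import numpy as np
import pandas as pd


def numeric_min_max(series: pd.Series) -> dict:
    """Hypothetical helper: min/max of a numeric series, including nullable ints."""
    present = series.dropna()
    # A plain float64 ndarray supports NumPy reductions directly.
    # Assumes at least one non-missing value; np.min raises on empty input.
    values = present.to_numpy(dtype="float64")
    return {"min": float(np.min(values)), "max": float(np.max(values))}


df = pd.DataFrame({"a": [1, 2, 3, 4, 5]}, dtype=pd.Int64Dtype())
stats = numeric_min_max(df["a"])
print(stats)
```

Calling series.min() and series.max() directly is another option, since pandas implements those reductions itself for nullable integers.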
