
BUG: Spearman correlation is broken (dtype mismatch) on 32-bit platforms


  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

import pandas as pd

# Raises ValueError on 32-bit platforms with the affected pandas versions
d = pd.DataFrame([1.0, 2.0])
d.corr(method='spearman')

Issue Description

Calling the corr method of a DataFrame with method='spearman' produces a ValueError due to a buffer dtype mismatch on 32-bit platforms.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.10/site-packages/pandas/core/frame.py", line 9376, in corr
    correl = libalgos.nancorr_spearman(mat, minp=min_periods)
  File "pandas/_libs/algos.pyx", line 415, in pandas._libs.algos.nancorr_spearman
  File "pandas/_libs/algos.pyx", line 938, in pandas._libs.algos.rank_1d
ValueError: Buffer dtype mismatch, expected 'const intp_t' but got 'long long'
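The error message can be read as an item-size disagreement: the Cython routine expects a buffer of intp_t (a pointer-sized signed integer), while the buffer it received holds 'long long' (64-bit) items. The following is a minimal sketch, using only the standard library, of how those two widths compare on the running interpreter. It assumes that ctypes.c_ssize_t has the same width as intp_t, which holds on mainstream platforms; the helper name buffers_compatible is hypothetical and is not part of pandas.

```python
import ctypes
import struct

# Size in bytes of a C pointer on this interpreter: 4 on 32-bit, 8 on 64-bit.
pointer_bytes = struct.calcsize("P")

# Assumption: c_ssize_t is as wide as intp_t (pointer-sized signed integer).
intp_bytes = ctypes.sizeof(ctypes.c_ssize_t)

# 'long long' is the C type named in the error message.
long_long_bytes = ctypes.sizeof(ctypes.c_longlong)


def buffers_compatible() -> bool:
    """Return True when a 'long long' buffer has the item size of intp_t."""
    return intp_bytes == long_long_bytes


print(f"python-bits      : {pointer_bytes * 8}")
print(f"intp_t width     : {intp_bytes * 8} bits")
print(f"long long width  : {long_long_bytes * 8} bits")
print(f"sizes compatible : {buffers_compatible()}")
```

On a 64-bit build the two widths agree, which would explain why the problem only shows up when python-bits is 32.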

If I have some time, I’ll look into this further and try to offer a PR. The problem was discovered due to a failing test in pingouin (https://github.com/raphaelvallat/pingouin/issues/197).

I have reproduced this on both 32-bit x86 and 32-bit ARM. The “installed versions” below are those currently in Fedora Rawhide, including pandas 1.3.0; I also built an RPM for pandas 1.3.3 and reproduced the bug with it.

Expected Behavior

     0
0  1.0
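The expected value can be cross-checked independently of pandas, since the Spearman coefficient is the Pearson correlation of the ranks. The sketch below is a plain-Python illustration of that definition, not the pandas implementation; the function names are hypothetical, and missing values are not handled.

```python
import math


def average_ranks(values):
    """Rank values starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the run while the next sorted value ties with the current one.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


column = [1.0, 2.0]
print(spearman(column, column))  # prints 1.0
```

A single column correlated with itself gives 1.0, matching the one-cell result shown above.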

Installed Versions

INSTALLED VERSIONS

commit : f00ed8f47020034e752baf0250483053340971b0
python : 3.10.0.candidate.2
python-bits : 32
OS : Linux
OS-release : 5.13.14-200.fc34.x86_64
Version : #1 SMP Fri Sep 3 15:33:01 UTC 2021
machine : armv7l
processor : armv7l
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.3.0
numpy : 1.21.1
pytz : 2021.1
dateutil : 2.8.1
pip : None
setuptools : 57.4.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.6.3
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : 1.3.2
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.5.0b1
numexpr : 2.7.1
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.7.0
sqlalchemy : None
tables : 3.6.1
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

musicinmybrain commented, Sep 15, 2021

Thanks! I appreciate your efforts.

Once a fix is available, I’ll work with the maintainers of the python-pandas package in Fedora Linux to try to make sure it is part of the upcoming Fedora 35 release. The current Fedora 34 release has pandas 1.2.5, which (I’ve verified) predates the regression.

musicinmybrain commented, Sep 15, 2021

Current master (5872bfe1c713ddb27b337e5b6549bf497e44834b) works as expected on 32-bit x86.


