BUG: `DataFrame.rank` & `Series.rank` results are inconsistent

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.



Code Sample, a copy-pastable example

>>> import pandas as pd
>>> import numpy as np
>>> values = [-np.inf, 0, np.inf, np.nan, 2, np.nan]         
>>> s = pd.Series(values)
>>> s
0   -inf
1    0.0
2    inf
3    NaN
4    2.0
5    NaN
dtype: float64
>>> kwargs = {'method': 'dense', 'na_option': 'bottom', 'ascending': False, 'pct': False, 'numeric_only': False}
>>> s.rank(**kwargs)
0    4.0
1    3.0
2    1.0
3    5.0
4    2.0
5    5.0
dtype: float64
>>> pd.DataFrame({'a':s}).rank(**kwargs)
     a
0  4.0
1  3.0
2  1.0
3  4.0
4  2.0
5  4.0

Problem description

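With method='dense', na_option='bottom' and ascending=False, Series.rank and DataFrame.rank return different results for the same data. Series.rank gives the NaN entries rank 5.0, after every non-NaN value, whereas DataFrame.rank gives them rank 4.0, tying them with -inf.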

Expected Output

DataFrame.rank should return the same ranks as Series.rank: the NaN entries should receive rank 5.0, below -inf at 4.0.
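The expected ranking can be reproduced independently of pandas. What follows is a minimal sketch in NumPy, not pandas's own implementation. The function name `dense_rank_desc_na_bottom` is hypothetical. It assumes "dense, descending, NaN at the bottom" means that distinct values are numbered 1, 2, 3, ... from largest to smallest and that all NaNs share the next rank after the smallest number.

```python
import numpy as np


def dense_rank_desc_na_bottom(values):
    """Dense rank, largest value first, NaN ranked after every number.

    Hypothetical helper for illustration; not part of pandas or NumPy.
    """
    arr = np.asarray(values, dtype=float)
    is_nan = np.isnan(arr)
    # Distinct non-NaN values, largest first; -inf and inf sort like any float.
    distinct = np.unique(arr[~is_nan])[::-1]
    ranks = np.empty(arr.shape, dtype=float)
    for position, value in enumerate(distinct, start=1):
        ranks[arr == value] = position
    # All NaNs share a single rank, one past the last non-NaN rank.
    ranks[is_nan] = len(distinct) + 1
    return ranks


expected = dense_rank_desc_na_bottom([-np.inf, 0, np.inf, np.nan, 2, np.nan])
print(expected)
```

Under these assumptions the helper ranks -inf at 4.0 and both NaN entries at 5.0, which matches the Series.rank output in the code sample.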

Output of pd.show_versions()

INSTALLED VERSIONS

commit           : 5f648bf1706dd75a9ca0d29f26eadfbb595fe52b
python           : 3.8.10.final.0
python-bits      : 64
OS               : Linux
OS-release       : 5.11.0-27-generic
Version          : #29~20.04.1-Ubuntu SMP Wed Aug 11 15:58:17 UTC 2021
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 1.3.2
numpy            : 1.21.2
pytz             : 2021.1
dateutil         : 2.8.2
pip              : 21.2.4
setuptools       : 57.4.0
Cython           : 0.29.24
pytest           : 6.2.4
hypothesis       : 6.17.2
sphinx           : 4.1.2
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : None
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : 3.0.1
IPython          : 7.27.0
pandas_datareader: None
bs4              : 4.9.3
bottleneck       : None
fsspec           : 2021.07.0
fastparquet      : None
gcsfs            : None
matplotlib       : None
numexpr          : None
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : 5.0.0
pyxlsb           : None
s3fs             : None
scipy            : None
sqlalchemy       : None
tables           : None
tabulate         : None
xarray           : None
xlrd             : None
xlwt             : None
numba            : 0.53.1

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

phofl commented, Aug 30, 2021

But you are correct, this was fixed by #41931

phofl commented, Aug 30, 2021

This works on master
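To check whether an installed pandas version still shows the discrepancy, one option is to sweep the rank options and compare the two code paths directly. This is a rough sketch, not a test taken from pandas. The function name `find_rank_mismatches` is hypothetical, and the option values are the documented choices for `method`, `na_option` and `ascending`.

```python
import itertools

import numpy as np
import pandas as pd


def find_rank_mismatches(values):
    """Return the rank() keyword sets for which Series and DataFrame disagree.

    Hypothetical helper for illustration; not part of pandas.
    """
    s = pd.Series(values)
    df = pd.DataFrame({"a": s})
    mismatches = []
    for method, na_option, ascending in itertools.product(
        ["average", "min", "max", "first", "dense"],
        ["keep", "top", "bottom"],
        [True, False],
    ):
        kwargs = {"method": method, "na_option": na_option, "ascending": ascending}
        from_series = s.rank(**kwargs)
        from_frame = df.rank(**kwargs)["a"]
        # Series.equals treats NaNs in matching positions as equal.
        if not from_series.equals(from_frame):
            mismatches.append(kwargs)
    return mismatches


mismatches = find_rank_mismatches([-np.inf, 0, np.inf, np.nan, 2, np.nan])
print(mismatches or "Series.rank and DataFrame.rank agree")
```

On pandas 1.3.2 the list should include at least the dense / bottom / descending combination from the report. On a version that contains the fix referenced above, one would expect that combination to be absent.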
