BUG: .nlargest with unsigned integers
Code Sample, a copy-pastable example if possible
```python
import numpy as np
import pandas as pd

pd.Series(np.array([0, 0, 0, 100, 1000, 10000, 100], dtype='uint32')).nlargest(5)
```
```
0        0
1        0
2        0
5    10000
4     1000
dtype: uint32
```
Problem description
`nlargest` favours 0 above positive values. This is common to both `uint32` and `uint64` dtypes, and possibly others.
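Until the bug is fixed, a simple user-side workaround (a sketch, not from the original report) is to sort explicitly, since the ordinary sorting path does not rely on negation:

```python
import numpy as np
import pandas as pd

s = pd.Series(np.array([0, 0, 0, 100, 1000, 10000, 100], dtype='uint32'))

# Returns the same values as s.nlargest(5); tie-breaking order may
# differ, but sorting handles unsigned dtypes correctly.
print(s.sort_values(ascending=False).head(5))
```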
Expected Output
```
5    10000
4     1000
3      100
6      100
0        0
dtype: uint32
```
Output of `pd.show_versions()`
```
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-327.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8

pandas: 0.20.3
pytest: 3.2.1
pip: 9.0.1
setuptools: 36.4.0
Cython: None
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.8
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: 1.1.13
pymysql: 0.7.9.None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None
```
I suspect the issue is with this block of code: https://github.com/pandas-dev/pandas/blob/480790531ffcc4329f280ddf6877d028d08e969f/pandas/core/algorithms.py#L1136-L1138
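For context, that block reduces `nlargest` to `nsmallest` by negating the values. Below is a paraphrased, self-contained sketch of the approach (`nlargest_via_negation` is a hypothetical helper, not the actual pandas source) that reproduces the reported behaviour:

```python
import numpy as np

def nlargest_via_negation(arr, n):
    # The negate-and-take-nsmallest trick: valid for signed dtypes,
    # but -arr wraps around for unsigned ones.
    neg = -arr
    idx = np.argpartition(neg, n - 1)[:n]  # indices of the n smallest of -arr
    return arr[idx[np.argsort(neg[idx])]]

values = np.array([0, 0, 0, 100, 1000, 10000, 100], dtype='uint32')
print(nlargest_via_negation(values.astype('int32'), 5))  # [10000  1000   100   100     0]
print(nlargest_via_negation(values, 5))                  # [    0     0     0 10000  1000]
```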
Specifically for uint data, I don't think `-arr` behaves as intended:
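A minimal demonstration of the wrap-around (added for illustration):

```python
import numpy as np

arr = np.array([0, 100, 1000, 10000], dtype='uint32')

# Unary minus wraps modulo 2**32 for uint32: zero stays 0 while every
# positive value becomes enormous, so "smallest of -arr" picks the zeros.
print(-arr)  # [         0 4294967196 4294966296 4294957296]
```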
Sigh… that's symptomatic of the same overflow issue presented with `uint`. Good catch!
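One overflow-safe alternative for unsigned dtypes (a sketch of the general idea, not necessarily the fix that landed in pandas) is to flip values against the dtype's maximum instead of negating:

```python
import numpy as np

arr = np.array([0, 0, 0, 100, 1000, 10000, 100], dtype='uint32')

# iinfo(...).max - arr reverses the ordering without ever wrapping, so the
# smallest flipped values correspond to the largest originals.
flipped = np.iinfo(arr.dtype).max - arr
order = np.argsort(flipped, kind='mergesort')[:5]  # stable sort keeps tie order
print(arr[order])  # [10000  1000   100   100     0]
```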