BUG: .nlargest with unsigned integers

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd

pd.Series(np.array([0, 0, 0, 100, 1000, 10000, 100], dtype='uint32')).nlargest(5)
0        0
1        0
2        0
5    10000
4     1000

Problem description

nlargest ranks 0 above positive values when the data is an unsigned integer type. This is common to both uint32 and uint64, and possibly other dtypes.

Expected Output

5    10000
4     1000
3      100
6      100
0        0
dtype: uint32
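
Until the bug is fixed in the installed version, one possible workaround is to sort the whole Series in descending order and take the first rows. This is only a sketch, not something proposed in the issue: it assumes a full sort is affordable for the data size, and the order of tied values (the two 100s here) may differ between pandas versions.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.array([0, 0, 0, 100, 1000, 10000, 100], dtype="uint32"))

# Workaround sketch: full descending sort, then keep the first five rows.
# This avoids nlargest entirely, at the cost of sorting every element.
top5 = s.sort_values(ascending=False, kind="mergesort").head(5)

print(top5)
```

The result keeps the original uint32 dtype, so no cast to a signed type is needed.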

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-327.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8

pandas: 0.20.3
pytest: 3.2.1
pip: 9.0.1
setuptools: 36.4.0
Cython: None
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.8
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: 1.1.13
pymysql: 0.7.9.None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 1
  • Comments: 13 (13 by maintainers)

Top GitHub Comments

jschendel commented, Jun 11, 2018 (1 reaction)

I suspect the issue is with this block of code: https://github.com/pandas-dev/pandas/blob/480790531ffcc4329f280ddf6877d028d08e969f/pandas/core/algorithms.py#L1136-L1138

Specifically for uint data, I don’t think -arr behaves as intended:

In [2]: arr = np.array([0, 0, 0, 100, 1000, 10000, 100], dtype='uint64')

In [3]: -arr
Out[3]:
array([                   0,                    0,                    0,
       18446744073709551516, 18446744073709550616, 18446744073709541616,
       18446744073709551516], dtype=uint64)

gfyoung commented, Jun 11, 2018 (0 reactions)

Sigh…that’s symptomatic of the same overflow issue presented with uint. Good catch!
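
The wraparound described in the comments can be reproduced with NumPy alone. The sketch below assumes that the linked block negates the array and then selects the smallest elements; it also shows bitwise NOT as one order-reversing transform that does not wrap for unsigned data. That alternative is an illustration only and is not necessarily the fix pandas adopted.

```python
import numpy as np

arr = np.array([0, 0, 0, 100, 1000, 10000, 100], dtype="uint64")

# Unary minus on unsigned data wraps modulo 2**64: zeros stay 0 (the
# minimum), while every positive value turns into a very large number.
negated = -arr

# Picking the 5 smallest negated values therefore selects the zeros first.
buggy_idx = np.argsort(negated, kind="stable")[:5]

# Bitwise NOT equals (max uint64 - value), so it reverses the ordering
# without any wraparound.
inverted = ~arr
safe_idx = np.argsort(inverted, kind="stable")[:5]

print(buggy_idx.tolist())  # [0, 1, 2, 5, 4], the order reported in the issue
print(safe_idx.tolist())   # [5, 4, 3, 6, 0], the expected order
```

The first index list matches the incorrect output in the report, and the second matches the expected output, which supports the diagnosis that negation of unsigned values is the cause.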
