v1.23.0: np.unique returns incorrect result for float16 array containing NaNs

See original GitHub issue

This only seems to affect float16 arrays (float32 and float64 are fine), and only if the array contains NaN values.

Reproduction:

import numpy as np
x = np.array([0, 1, np.nan], dtype='float16')
print(np.__version__)
print(x)
print(np.unique(x))

Output in numpy 1.22.4:

1.22.4
[ 0.  1. nan]
[ 0.  1. nan]

Output in numpy 1.23.0:

1.23.0
[ 0.  1. nan]
[0.]
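A possible stopgap on an affected version, sketched here rather than taken from the issue thread: since the report says float32 and float64 behave correctly, the float16 data can be upcast before calling np.unique and cast back afterwards. The helper name unique_via_float32 is hypothetical, and the approach assumes the extra memory of a temporary float32 copy is acceptable.

```python
import numpy as np


def unique_via_float32(values):
    """Return unique values, routing float16 input through float32.

    Hypothetical helper (not part of NumPy). It relies on the report's
    observation that float32 is unaffected by the bug.
    """
    arr = np.asarray(values)
    if arr.dtype == np.float16:
        # Every float16 value is exactly representable in float32,
        # so the round trip does not change any value.
        return np.unique(arr.astype(np.float32)).astype(np.float16)
    return np.unique(arr)


x = np.array([0, 1, np.nan], dtype="float16")
result = unique_via_float32(x)
print(result)
```

Upgrading to a NumPy release that contains the fix is the cleaner option; the upcast is only useful when pinned to 1.23.0.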

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

postmalloc commented, Jun 24, 2022

@seberg I added some test cases specifically for float16 and float32; includes the test case you shared earlier.

The underlying issue seems to be that np.searchsorted is broken with NaN for float16. I am not quite sure why; maybe it happened as part of the C++ conversions?

EDIT:

x = np.array([0, 1, np.nan], dtype='float16')
np.searchsorted(x, x[-1])
# Should return 2 but returns 0
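One way to check whether a given installation is affected is to run the same searchsorted call for each float dtype and compare the results. This is a minimal sketch; the function name nan_insertion_points is hypothetical. Based on the comment above, float16 is expected to disagree with the other dtypes on 1.23.0 and to agree with them on versions without the bug.

```python
import numpy as np


def nan_insertion_points(dtypes=("float16", "float32", "float64")):
    """Map each dtype name to the index searchsorted reports for NaN."""
    points = {}
    for name in dtypes:
        x = np.array([0, 1, np.nan], dtype=name)
        # NaN sorts last, so the NaN already in the array sits at index 2.
        points[name] = int(np.searchsorted(x, x[-1]))
    return points


points = nan_insertion_points()
print(np.__version__, points)
```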

postmalloc commented, Jun 23, 2022

I’ve taken a quick look. It seems the type_num for np.float16 is 23 according to the NPY_TYPES enum sequence. However, the taglist in binarysearch has npy::half_tag at a different position (at index 11). The indices don’t match, and it never enters HALF_LT as you observed.
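The type number quoted above can be read from Python through the num attribute of a dtype. The sketch below only prints those numbers for the float dtypes; it does not inspect the C++ taglist itself, and the helper name type_numbers is hypothetical.

```python
import numpy as np


def type_numbers(names=("float16", "float32", "float64")):
    """Return the NPY_TYPES enum value (dtype.num) for each dtype name."""
    return {name: np.dtype(name).num for name in names}


nums = type_numbers()
print(nums)
```

A lookup that assumes a tag's position in a list equals its type number will miss float16 whenever the two diverge, which matches the mismatch described in the comment.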


