BUG: np.str_ and np.bytes_ objects have a confusing repr and str

>>> s = np.str_("\0\0")
>>> list(s)
['\0', '\0']
>>> len(s)
2
>>> s
''  # wat

>>> b = np.bytes_(2)
>>> list(b)
[0, 0]
>>> len(b)
2
>>> b
b''  # wat

Caused by

https://github.com/numpy/numpy/blob/f15026ce782f3b8c291cde5363b86f4f1bc88ad8/numpy/core/src/multiarray/scalartypes.c.src#L389-L425

It looks to me like we should just remove our overloads of these functions; they don’t make any sense.
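To make the proposal concrete, here is a minimal pure-Python sketch (no NumPy involved) contrasting a `str` subclass that overrides `__repr__` and `__str__` to drop trailing NULs with one that simply inherits the built-in behavior. The class names `StrippedStr` and `InheritedStr` are hypothetical stand-ins, not NumPy types, and the stripping logic is only an assumption about what the linked C code does.

```python
class StrippedStr(str):
    """Hypothetical stand-in: repr/str drop trailing NUL characters."""

    def __repr__(self):
        # str.rstrip returns a plain str, so this cannot recurse.
        return repr(self.rstrip("\0"))

    def __str__(self):
        return self.rstrip("\0")


class InheritedStr(str):
    """Hypothetical stand-in: no overrides, so built-in str behavior applies."""


stripped = StrippedStr("\0\0")
inherited = InheritedStr("\0\0")

print(len(stripped), repr(stripped))    # 2 ''
print(len(inherited), repr(inherited))  # 2 '\x00\x00'
```

With the overrides removed, the repr agrees with `len()` and `list()`, which is the consistency the report asks for.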

Issue Analytics

Created 4 years ago
Comments: 9 (9 by maintainers)

Top GitHub Comments

seberg commented, Feb 1, 2020

True, although that code will probably give the warning (although if you have many of those you would not notice). It's true from a subclassing perspective… It seems we have to sacrifice either consistent subclassing or consistency with arrays. Since this is a NumPy scalar, which is almost an array, I think I still lean a bit towards 3. But basically there is no good solution, and either 3 or 4 seems better than the current state to me.

seberg commented, Feb 1, 2020

I am for Option 3.

Any option should give the same result as np.array(string, dtype="S")[()], and np.array(str, "S") and np.string_(str) should be as similar as possible (at the moment at least, due to PyArray_Return).
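As a rough illustration of why consistency with arrays pulls towards stripping, the sketch below models a fixed-width bytes field in plain Python. It assumes the field is padded with NUL bytes up to its width, so padding and genuine trailing NULs cannot be told apart when the value is read back. The helpers `store_fixed_width` and `load_fixed_width` are hypothetical and are not NumPy APIs.

```python
def store_fixed_width(value: bytes, width: int) -> bytes:
    """Pad value with NUL bytes to exactly `width` bytes (hypothetical helper)."""
    if len(value) > width:
        raise ValueError("value does not fit in the field")
    return value.ljust(width, b"\0")


def load_fixed_width(field: bytes) -> bytes:
    """Read a field back; padding and real trailing NULs look identical."""
    return field.rstrip(b"\0")


field_a = store_fixed_width(b"ab", 4)      # padded by the store
field_b = store_fixed_width(b"ab\0\0", 4)  # NULs supplied by the caller

print(field_a == field_b)         # True
print(load_fixed_width(field_b))  # b'ab'
```

Under this model, a scalar that kept its trailing NULs would no longer round-trip to the same value as the corresponding array element.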

Since it can be a bug for someone who needs the zero bytes, I think a warning makes sense. The only question is whether the warning for some reason is prohibitively noisy. I do not expect that. It could point out that void can sometimes be a way to spell bytes of fixed length.
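One reading of the warning idea, sketched in plain Python: a constructor that still drops trailing zero bytes but warns whenever it actually discards something. The function name `make_bytes_scalar`, the warning category, and the message text are all assumptions made for illustration; the thread does not specify any of them.

```python
import warnings


def make_bytes_scalar(value: bytes) -> bytes:
    """Hypothetical constructor: strip trailing NULs, warning when any are lost."""
    stripped = value.rstrip(b"\0")
    if len(stripped) != len(value):
        warnings.warn(
            "trailing zero bytes were dropped; "
            "use a fixed-length type if they matter",
            UserWarning,
            stacklevel=2,
        )
    return stripped


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = make_bytes_scalar(b"ab\0\0")

print(result, len(caught))  # b'ab' 1
```

Values without trailing zero bytes pass through silently, so the warning would only be noisy for code that routinely constructs such values.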