
DOC: Improve the description of the `dtype` parameter in `numpy.array` docstring


The description of the dtype parameter in the numpy.array docstring reads as follows:

dtype : data-type, optional

The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence. This argument can only be used to ‘upcast’ the array. For downcasting, use the .astype(t) method.

This is rather misleading, because the actual behavior is different. For example, for integers the type chain on Windows is (int32 -> int64 -> uint64 -> object); in general it starts with np.int_. There is no np.uint32 in this chain, nor np.int8, and so on.
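The chain described above can be observed directly. The following is a minimal sketch, not part of the original report: `inferred_dtype` is a hypothetical helper introduced only for illustration, and the sample values are chosen to cross the int64 and uint64 limits.

```python
import numpy as np


def inferred_dtype(value):
    """Return the dtype np.array picks for a one-element list (hypothetical helper)."""
    return np.array([value]).dtype


# A small value gets the default integer type, not the smallest type that fits.
small = inferred_dtype(1)
# A value just past the int64 range moves on to uint64.
beyond_int64 = inferred_dtype(2**63)
# A value past the uint64 range falls back to a Python object array.
beyond_uint64 = inferred_dtype(2**64)

for name, dt in [("1", small), ("2**63", beyond_int64), ("2**64", beyond_uint64)]:
    print(name, "->", dt)
```

Note that the value 1 would fit in np.int8, yet the inferred type is the platform default integer, which is the mismatch with "minimum type required" that the report points out.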

One more thought: this behavior is also inconsistent with the following snippet:

>>> (np.array(1, np.int32) + np.array(1, np.uint64)).dtype
dtype('float64')  #instead of dtype('O')
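The same result can be obtained without building arrays, by asking NumPy for the promoted type. This is only an illustrative sketch; `np.promote_types` is used here as a convenient way to inspect the rule, not as a claim about how the addition is implemented internally.

```python
import numpy as np

# No integer type holds both the int32 and the uint64 range,
# so promotion falls back to a floating-point type.
promoted = np.promote_types(np.int32, np.uint64)

# The arithmetic expression from the snippet yields the same dtype.
summed = (np.array(1, np.int32) + np.array(1, np.uint64)).dtype

print(promoted, summed)
```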

Thus in np.array NumPy follows Python’s rules, while in expressions it follows C’s rules. But it also follows them deliberately:

>>> (np.array([0x7fffffff]) + np.array([0x7fffffff])).dtype
dtype('int32')    #instead of dtype('uint32')
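The snippet above relies on the default integer being int32, as on Windows in the report. A platform-independent sketch of the same effect, assuming the dtype is fixed explicitly to np.int32:

```python
import numpy as np

# Fix the dtype so the example does not depend on the platform default integer.
a = np.array([0x7fffffff], dtype=np.int32)

# Adding two int32 arrays keeps int32; the sum wraps around instead of upcasting.
total = a + a

print(total.dtype, total[0])
```

The result stays int32 and the value wraps to a negative number, in contrast to np.array, which picks a wider type when a literal does not fit.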

Another point: by default the dtype will be numpy.float64 instead of np.int_, which contradicts the description. The related issue is #10405.
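One situation in which np.array falls back to numpy.float64 is an empty sequence. Whether this is exactly the case meant here is an assumption; the sketch below only shows that fallback next to ordinary integer input.

```python
import numpy as np

# With no elements to inspect, np.array falls back to float64.
empty = np.array([])

# With integer elements, the default integer type is used instead.
ints = np.array([1, 2, 3])

print(empty.dtype, ints.dtype)
```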

I don’t know the right way to resolve this tangle. I opened this issue here because there seems to be no interest in discussing it on NumPy’s mailing list.

P.S. One more point: is there any real benefit to defining np.int_ as C’s long (instead of 8 bytes on 64-bit and 4 bytes on 32-bit platforms), given the resulting differences between 64-bit Windows and other OSs? Personally, I have never seen an advantage of this choice, only a few inconveniences and the need to provide dtype= everywhere.

Issue Analytics

Created 6 years ago
Comments: 17 (15 by maintainers)

Top GitHub Comments

1 reaction

mattip commented, Jun 17, 2020

The documentation of the dtype kwarg still states:

The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence.

I wish we could write “If not given, then the type will be determined according to internal arbitrary rules that may change between versions.” In practice it would be good to link to a part of a NEP or other documentation that describes the algorithm used.

1 reaction

eric-wieser commented, Feb 16, 2018

This argument can only be used to ‘upcast’ the array. For downcasting, use the .astype(t) method.

This is plain wrong:

>>> a = np.array(256, np.uint16)
>>> np.array(a, dtype=np.uint8) # definitely not a downcast... /s
array(0, dtype=uint8)
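The same point can be checked on a small array. This is a minimal sketch with illustrative values chosen here: passing dtype= to np.array narrows an existing array just as .astype does, wrapping values that do not fit.

```python
import numpy as np

# Values that fit in uint16 but not in uint8.
wide = np.array([256, 257], dtype=np.uint16)

# np.array with dtype= happily narrows an existing array...
via_array = np.array(wide, dtype=np.uint8)
# ...and gives the same result as the documented .astype route.
via_astype = wide.astype(np.uint8)

print(via_array, via_astype)
```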
