DOC: Improve the description of the `dtype` parameter in `numpy.array` docstring
The description of the dtype parameter in the numpy.array docstring reads as follows:
dtype : data-type, optional
The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence. This argument can only be used to ‘upcast’ the array. For downcasting, use the .astype(t) method.
which is rather misleading, because the actual behavior is different. For example, for integers the type chain on Windows is (int32 -> int64 -> uint64 -> object); in general it starts with np.int_. There is no np.uint32 in this chain, nor np.int8, etc.
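A minimal sketch of the chain described above, assuming a reasonably recent NumPy. The variable names are only illustrative, and the width of the starting type depends on the platform and NumPy version, so the check is against np.int_ rather than a fixed size:

```python
import numpy as np

# Let np.array infer the dtype from Python integers of increasing magnitude.
small = np.array([1, 2, 3]).dtype   # the default integer type, not int8
big = np.array([2**63]).dtype       # does not fit in int64
huge = np.array([2**64]).dtype      # does not fit in any fixed-width integer

print(small, big, huge)
```

Even though the values 1, 2, 3 would fit in int8, the inferred dtype is the default integer, which is what makes the "minimum type" wording misleading.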
One more thought, this behavior is also inconsistent with the following snippet:
>>> (np.array(1, np.int32) + np.array(1, np.uint64)).dtype
dtype('float64') #instead of dtype('O')
Thus, in np.array NumPy follows Python's rules, while in expressions it follows C's rules. But it also follows them deliberately:
>>> (np.array([0x7fffffff]) + np.array([0x7fffffff])).dtype
dtype('int32') #instead of dtype('uint32')
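The two snippets above can be reproduced without depending on the platform's default integer by spelling out the dtypes. This is only a sketch: np.promote_types is used here to show the promotion rule directly, and int32 is passed explicitly because the default integer is not int32 everywhere.

```python
import numpy as np

# Promotion between a signed and an unsigned 64-bit-range integer dtype:
# no integer dtype can hold both ranges, so the result is a float.
mixed = np.promote_types(np.int32, np.uint64)

# Same-dtype array arithmetic keeps the dtype; an overflow wraps around
# instead of moving to a wider or unsigned type.
a = np.array([0x7fffffff], dtype=np.int32)
total = a + a

print(mixed, total.dtype, total)
```

So the inference in np.array widens the type when a value does not fit, while arithmetic on existing arrays never looks at the values.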
Another point is that by default the dtype will be numpy.float64 instead of np.int_, which contradicts the description. A related issue is #10405.
I don’t know what the right way to resolve this tangle is. I opened this issue here because there seems to be no interest in discussing this on NumPy’s mailing list.
P.S. One more point: is there any real benefit to defining np.int_ as C’s long (instead of 8 bytes on 64-bit and 4 bytes on 32-bit platforms), given the resulting differences between 64-bit Windows and other OSs? Personally, I have never seen any advantage of this choice, only a few inconveniences and the need to provide dtype= everywhere.
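To see what this means on a given machine, one can compare the size of np.int_ with the pointer-sized integer np.intp. This is just a quick check, and the result depends on both the platform and the NumPy version, so no expected output is given:

```python
import numpy as np

# Size in bytes of the default integer and of the pointer-sized integer.
int_size = np.dtype(np.int_).itemsize
intp_size = np.dtype(np.intp).itemsize

print("np.int_:", int_size, "bytes; np.intp:", intp_size, "bytes")
```

When the two sizes differ, code that relies on the default integer behaves differently from the same code on a platform where they match, which is the inconvenience described above.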
Issue Analytics
- Created 6 years ago
- Comments: 17 (15 by maintainers)

The documentation of the dtype kwarg still states
I wish we could write “If not given, then the type will be determined according to internal arbitrary rules that may change between versions.” In practice it would be good to link to a part of a NEP or other documentation that describes the algorithm used.
This is plain wrong: