Can't take max of arrays at least as large as 2 ** 32
Describe the bug
Calling sparse.COO.max on an array larger than 2 ** 32 - 1 fails with a TypeError like so:
>>> a.shape
(4294967296,)
>>> a.max()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\<path_redacted>\sparse\_sparse_array.py", line 444, in max
    return np.maximum.reduce(self, out=out, axis=axis, keepdims=keepdims)
  File "C:\<path_redacted>\sparse\_sparse_array.py", line 307, in __array_ufunc__
    result = SparseArray._reduce(ufunc, *inputs, **kwargs)
  File "C:\<path_redacted>\sparse\_sparse_array.py", line 278, in _reduce
    return self.reduce(method, **kwargs)
  File "C:\<path_redacted>\sparse\_sparse_array.py", line 360, in reduce
    out = self._reduce_calc(method, axis, keepdims, **kwargs)
  File "C:\<path_redacted>\sparse\_coo\core.py", line 692, in _reduce_calc
    data, inv_idx, counts = _grouped_reduce(a.data, a.coords[0], method, **kwargs)
  File "C:\<path_redacted>\sparse\_coo\core.py", line 1566, in _grouped_reduce
    result = method.reduceat(x, inv_idx, **kwargs)
TypeError: Cannot cast array data from dtype('uint64') to dtype('int64') according to the rule 'safe'
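The last frame of the traceback can probably be reproduced with NumPy alone. The sketch below is not code from sparse; the names data, starts_unsigned and starts_signed are made up for illustration, and it assumes a 64-bit platform where NumPy's index type is int64. It passes uint64 offsets to np.maximum.reduceat, which should raise the same TypeError, and then shows that casting the offsets to int64 avoids it.

```python
import numpy as np

data = np.array([3.0, 1.0, 2.0])

# Hypothetical stand-in for inv_idx: group-start offsets stored as uint64.
starts_unsigned = np.array([0], dtype=np.uint64)

try:
    # reduceat converts its indices to the platform index type (int64 on
    # 64-bit systems) using the 'safe' casting rule, which rejects uint64.
    np.maximum.reduceat(data, starts_unsigned)
    raised = False
except TypeError as exc:
    raised = True
    print(exc)

# Casting the offsets to a signed 64-bit type sidesteps the unsafe cast.
starts_signed = starts_unsigned.astype(np.int64)
result = np.maximum.reduceat(data, starts_signed)
print(result)
```

If this behaves as assumed, the failure comes from the dtype of the index array rather than from the data being reduced.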
To Reproduce
Create an array a at least as large as 2 ** 32 with at least one nonzero element, then call a.max(). For example:
>>> b = sparse.DOK((2 ** 32,))
>>> b[0] = 1
>>> a = sparse.COO(b)
>>> a.nnz
1
>>> a.max() # TypeError
Expected behavior
Return the maximum value of the array (1 in the example above).
System
- OS and version: Windows 10
- sparse version: 0.12.0+44.g765e297 (bug is also present in 0.12.0, installed from pip)
- NumPy version: 1.18.5
- Numba version: 0.53.1
Additional context
sparse.COO.max works on an array of size 2 ** 32 if it is empty (i.e. a.nnz == 0).

Thanks @GPhilo for digging into this, I’ll try to set some time aside this weekend to fix it and cut a release.
I traced the issue to its source and came up with a hack to make this work, should anyone else run into this problem. Basically, when this reshape is called, because idx_type is ignored (as mentioned in the comment above), it uses the default int32 idx_type. Since int32 can't store the new shape, this test passes and idx_type gets converted to the result of np.min_scalar_type(max(shape)), which is np.uint64, and that's what causes the problem.
My hack to solve this is to hardcode np.int64 instead of letting NumPy choose. This solves the problem when calling max().
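For anyone who wants to check the dtype selection described above, here is a small NumPy-only sketch. It does not touch sparse's internals; the variable names are illustrative, and the can_cast result assumes a 64-bit platform.

```python
import numpy as np

shape = (2 ** 32,)

# int32 cannot hold the largest dimension, so a wider type is needed.
fits_int32 = max(shape) <= np.iinfo(np.int32).max

# For a non-negative integer, min_scalar_type picks the smallest
# *unsigned* type that can hold the value.
chosen = np.min_scalar_type(max(shape))
print(chosen)

# uint64 cannot be safely cast to the signed index type NumPy uses
# (int64 on 64-bit platforms), which matches the error in the traceback.
safe_to_intp = np.can_cast(chosen, np.intp, casting="safe")

# int64 is still wide enough for the shape, which is why hardcoding it works.
fits_int64 = max(shape) <= np.iinfo(np.int64).max
print(fits_int32, safe_to_intp, fits_int64)
```

Under those assumptions, np.int64 holds the shape and is accepted as an index type, which is consistent with the hack above.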