behavior of `is_total_slice` for boundary chunks
The `is_total_slice` function is used to determine whether a selection corresponds to an entire chunk (`is_total_slice` -> `True`) or not (`is_total_slice` -> `False`).
I noticed that `is_total_slice` is always `False` for “partial chunks”, i.e. chunks that are not filled by the array:
import zarr
from zarr.indexing import BasicIndexer
from zarr.util import is_total_slice
# create an array with 1 full chunk and 1 partial chunk
a = zarr.open('test.zarr', path='test', shape=(10,), chunks=(9,), dtype='uint8', mode='w')
# iterate over the chunks touched by a[:] and test each chunk-local selection
for x in BasicIndexer(slice(None), a):
    print(x.chunk_selection, is_total_slice(x.chunk_selection, a._chunks))
Which prints this:
(slice(0, 9, 1),) True
(slice(0, 1, 1),) False
Although the last selection is not the size of a full chunk, it is “total” with respect to the output of that selection in the array.
A direct consequence of this behavior is unnecessary chunk loading when performing array assignments – zarr uses the result of `is_total_slice` to decide whether to load existing chunk data or not. Because `is_total_slice` is always `False` for partial chunks, zarr always loads boundary chunks during assignment.
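To make the cost concrete, here is a minimal sketch of the decision described above. It is not zarr's implementation: the function name `chunks_loaded_on_full_write` is hypothetical, the array is 1-D, and the totality test is assumed to compare only the selection length with the nominal chunk length, which matches the output printed earlier.

```python
# Sketch only: a simplified model of the write path, not zarr's code.
# `chunks_loaded_on_full_write` is a hypothetical name used for illustration.

def chunks_loaded_on_full_write(array_len, chunk_len):
    """Count the chunks that would be read back when assigning to a[:] (1-D)."""
    loads = 0
    for start in range(0, array_len, chunk_len):
        # the last chunk may extend past the end of the array
        stop = min(start + chunk_len, array_len)
        selected = stop - start            # length of the chunk-local selection
        is_total = selected == chunk_len   # assumed length-only test
        if not is_total:
            loads += 1                     # existing chunk data is fetched first
    return loads


print(chunks_loaded_on_full_write(10, 9))  # 1: the boundary chunk is loaded
print(chunks_loaded_on_full_write(18, 9))  # 0: no partial chunk
```

In this model the load of the boundary chunk is wasted work, because every array element stored in that chunk is about to be overwritten anyway.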
A solution to this would be to augment the `is_total_slice` function to account for partial chunks. I’m not sure at the moment how to do this exactly, but it shouldn’t be hard. Happy to bring forth a PR if people agree that this is an issue worth addressing.
Comments: 5 (2 by maintainers)
I made some tests and I think there is actually no way to reach the desired behavior of `is_total_slice`, since the modification proposed by @d-v-b relies on the relative position of the slice in the array, and this kind of information is not passed to `is_total_slice`.
One possible solution would be to add a new (maybe optional) parameter to `is_total_slice` to communicate the relative position of the slice (i.e. the upper left corner).
In this case I would like to work on the problem.
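A rough sketch of what such a position-aware check might look like. Everything here is an assumption for illustration: the function name `is_total_slice_at` and the extra parameters `chunk_coords` and `array_shape` are hypothetical and not part of zarr's API. The position is passed as the chunk's index along each dimension rather than as a corner in array coordinates.

```python
# Hypothetical sketch of the proposed extension, not zarr's API.

def is_total_slice_at(selection, chunk_shape, chunk_coords, array_shape):
    """Return True if `selection` covers every array element inside the chunk.

    selection    -- tuple of chunk-local slices, one per dimension
    chunk_shape  -- nominal chunk length along each dimension
    chunk_coords -- index of the chunk along each dimension (assumed parameter)
    array_shape  -- shape of the whole array (assumed parameter)
    """
    for s, clen, cidx, alen in zip(selection, chunk_shape, chunk_coords, array_shape):
        # number of elements of this chunk that lie inside the array
        extent = min(clen, alen - cidx * clen)
        start, stop, step = s.indices(clen)
        if step != 1 or start != 0 or stop < extent:
            return False
    return True


# shape=(10,), chunks=(9,): one full chunk and one boundary chunk
print(is_total_slice_at((slice(0, 9, 1),), (9,), (0,), (10,)))  # True
print(is_total_slice_at((slice(0, 1, 1),), (9,), (1,), (10,)))  # True
print(is_total_slice_at((slice(0, 1, 1),), (9,), (0,), (10,)))  # False
```

The same selection `slice(0, 1, 1)` is total for the boundary chunk but not for the first chunk, which is why the position has to be supplied by the caller.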
Just a question to check whether I understood the key idea of `zarr`.
If we consider the example provided by @d-v-b, I would expect the result of `is_total_slice(slice(1,10,1), a._chunks)` to be `False`, since the array is divided into chunks `a[0:9]` and `a[9]`, so `a[1:10]` is not a chunk. However it looks like this test returns `True`. Am I doing something wrong?
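One possible explanation, sketched under the assumption that the check only compares the length of each slice with the chunk length and ignores where the slice starts. This is consistent with the outputs reported in this thread, but `length_only_total` below is a hypothetical stand-in, not zarr's code.

```python
# Simplified stand-in for a length-only check (an assumption about the
# behaviour, not a copy of zarr's implementation).

def length_only_total(selection, chunk_shape):
    """Return True if every slice has step 1 and spans one chunk length."""
    if isinstance(selection, slice):
        selection = (selection,)
    return all(
        s.step in (1, None) and s.stop - s.start == n
        for s, n in zip(selection, chunk_shape)
    )


chunk_shape = (9,)
print(length_only_total(slice(0, 9, 1), chunk_shape))   # True
print(length_only_total(slice(1, 10, 1), chunk_shape))  # True: length is 9
print(length_only_total(slice(0, 1, 1), chunk_shape))   # False
```

Under that assumption `slice(1, 10, 1)` passes simply because it has length 9. Note also that the selections in the original example are chunk-local (the second chunk yields `slice(0, 1, 1)`, not `slice(9, 10, 1)`), so the function appears to expect slices relative to a chunk rather than positions in the array such as `a[1:10]`.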