array.overlap and array.map_overlap block sizes are incorrect when depth is an unsigned bit type

Minimal Example:

import numpy as np
import dask.array as da

def func(block, block_info=None):
    print(block_info[0]['array-location'])
    print(block.shape)
    return block

x = da.ones((100,), chunks=(50,50))

d_signed = tuple( np.array([10, 10]).astype(np.int16) )
y_signed = da.map_overlap(func, x, dtype=x.dtype, depth=d_signed)

d_unsigned = tuple( np.array([10, 10]).astype(np.uint16) )
y_unsigned = da.map_overlap(func, x, dtype=x.dtype, depth=d_unsigned)

y_signed.compute()

[(0, 70)]
[(70, 140)]
(70,)
(70,)

y_unsigned.compute()

[(70, 110)]
(40,)
[(0, 70)]
(60,)

I gather that map_overlap pads each block with data from its neighbors, then calls map_blocks on the “augmented” array. I suspect that when this augmented array is trimmed there is an expression like new_block = augmented_array[d[0]:-d[0], ...], where d is the depth tuple. Negating the depth to get the end of the slice does not work for unsigned integer types, because the negation wraps around to a large positive value instead of producing a negative index. This could be replaced with augmented_array[d[0]:-1*d[0], ...] so that an unsigned value is cast to a signed type before slicing.
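To illustrate the suspected failure mode, here is a minimal NumPy-only sketch. It is not dask's actual trimming code; the 70-element array is just an assumed stand-in for a 50-element chunk padded by a depth of 10 on each side.

```python
import numpy as np

depth = np.uint16(10)

# Negating an unsigned integer wraps around modulo 2**16 instead of going negative.
# errstate silences the overflow warning that some NumPy versions emit here.
with np.errstate(over="ignore"):
    wrapped = -depth

# Stand-in for a 50-element chunk padded by 10 elements on each side (assumption).
block = np.arange(70)

# With the wrapped value, the stop index lies far past the end, so nothing is
# trimmed from the right-hand side.
trimmed_unsigned = block[depth:wrapped]

# With plain Python ints, the stop index is -10 and both sides are trimmed.
trimmed_signed = block[int(depth):-int(depth)]

print(int(wrapped))
print(trimmed_unsigned.shape)
print(trimmed_signed.shape)
```

The unsigned slice keeps 60 elements rather than 50, which is consistent with the (60,) block shape reported above, although this sketch does not attempt to reproduce the full map_overlap behavior.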

Why should we support unsigned integer type arguments here?

The depth argument is intrinsically an unsigned quantity; there is no “negative depth” in this context. It’s reasonable for users to expect this data type to be accepted. When depths are computed automatically from things like array shapes (e.g. my_depth = tuple(np.array(my_array.shape) // 8) for an approximate 12.5% overlap), they will often be unsigned integer types.

Environment:

  • Dask version: 2.20.0
  • Python version: 3.6.4
  • Operating System: Scientific Linux
  • Install method (conda, pip, source): pip

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 10 (10 by maintainers)

Top GitHub Comments

jsignell commented, Nov 30, 2020 (1 reaction)

Both da.map_overlap and da.overlap.overlap call da.coerce_depth to reformat the depth data structure. The simplest solution would be to wrap the depth elements with int() here in this parsing function.

This seems like a reasonable solution to me. Type coercion should happen sooner rather than later so that it can fail quickly.
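A rough sketch of what wrapping the depth elements with int() could look like is given below. The helper name to_plain_ints is hypothetical and is not dask's coerce_depth; it only assumes that depth arrives as a single integer, a tuple or list of integers, or a dict mapping axes to integers.

```python
import numpy as np

def to_plain_ints(depth):
    """Hypothetical helper: convert depth values to built-in Python ints.

    Assumes depth is an integer, a tuple/list of integers, or a dict that maps
    axis numbers to integers. Anything int() cannot convert raises right away.
    """
    if isinstance(depth, dict):
        return {axis: int(d) for axis, d in depth.items()}
    if isinstance(depth, (tuple, list)):
        return tuple(int(d) for d in depth)
    return int(depth)

# Depth built from an unsigned NumPy array, as in the minimal example above.
d_unsigned = tuple(np.array([10, 10]).astype(np.uint16))
d_plain = to_plain_ints(d_unsigned)

print(d_plain)
print(-d_plain[0])
```

After this conversion, negating a depth element gives an ordinary negative Python int, so a slice such as [d:-d] trims both ends as intended.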

GFleishman commented, Nov 25, 2020 (1 reaction)

Sorry for delay - I’m still willing to fix and submit PR, just have deliverables at work that take priority. I may take some time over the holiday to look into this.
