Memoryviews and compression

Currently there are some issues with how we handle memoryviews.

We assume that len(mv) == mv.nbytes in several places. This does not hold when the memoryview has a non-trivial shape or an itemsize other than 1.
We slice into memoryviews in at least one place (see distributed/protocol/utils.py), which is also incorrect for a non-trivial shape or itemsize, because slices index items rather than bytes.
To work around these concerns, the NumPy serialization code currently always produces memoryviews that have strides (1,). This discards information that intelligent compression would need.
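To make the first two points concrete, here is a minimal sketch using only the standard library. The `array`-backed buffer and the variable names are stand-ins chosen for illustration; they are not taken from the distributed codebase.

```python
from array import array

# Four 8-byte doubles; the memoryview exposes them as items, not bytes.
data = array("d", [1.0, 2.0, 3.0, 4.0])
mv = memoryview(data)

item_count = len(mv)      # number of items along the first dimension
byte_count = mv.nbytes    # total size in bytes (item count * itemsize)
print(item_count, byte_count, mv.itemsize)

# Slicing counts items too, so mv[:2] covers 2 * itemsize bytes, not 2 bytes.
head = mv[:2]
print(head.nbytes)

# With a 2-d shape, len() reports only the first dimension.
grid = mv.cast("B").cast("d", shape=[2, 2])
print(len(grid), grid.nbytes, grid.shape)
```

Any code that treats `len(mv)` as a byte count, or slices with byte offsets, would only be correct for one-dimensional views with itemsize 1.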

I think that ideally we would propagate itemsize and strides information on memoryviews until after we pass through compression. Then we might consider flattening memoryviews before they enter the network layer (tornado.iostream.IOStream.write) or perhaps earlier. https://stackoverflow.com/questions/44486048/how-to-flatten-a-memoryview
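A rough sketch of what flattening just before the network layer could look like is below. The `flatten` helper is hypothetical, not an existing function in distributed, and it assumes that copying is acceptable for views that are not C-contiguous.

```python
from array import array

def flatten(buf):
    """Return a 1-d view of unsigned bytes over buf (hypothetical helper).

    C-contiguous views are re-cast without copying; anything else is
    copied into a new bytes object first.
    """
    mv = memoryview(buf)
    if mv.c_contiguous:
        return mv.cast("B")           # zero-copy reinterpretation as bytes
    return memoryview(mv.tobytes())   # copy for strided, non-contiguous views

data = array("d", [1.0, 2.0, 3.0, 4.0])

# A 2-d, C-contiguous view: flattened without a copy.
grid = memoryview(data).cast("B").cast("d", shape=[2, 2])
flat = flatten(grid)

# A strided view (every other item): flattened via a copy.
strided = memoryview(data)[::2]
flat_strided = flatten(strided)

print(len(flat), flat.nbytes, len(flat_strided))
```

After flattening, `len(flat) == flat.nbytes` holds again, so byte-oriented code downstream can keep its current assumptions while shape and itemsize stay available up to the compression step.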

Currently things seem safe, but they are inefficient in cases where itemsize would be useful to the compressor.
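As one illustration of why itemsize can matter, the sketch below applies a byte shuffle before compressing. That is, it groups the k-th byte of every item together. The use of zlib, the `shuffle_bytes` helper, and the sample data are all assumptions made for this example; the issue does not say which compressor or transform is intended.

```python
import zlib
from array import array

def shuffle_bytes(buf):
    """Group the k-th byte of every item together (hypothetical helper).

    Requires a C-contiguous buffer. The result is a permutation of the
    original bytes, so it can be undone if itemsize is known.
    """
    mv = memoryview(buf)
    itemsize = mv.itemsize
    flat = mv.cast("B")
    return b"".join(flat[k::itemsize].tobytes() for k in range(itemsize))

# Sample data: small 64-bit integers, so most bytes of each item are zero.
values = array("q", range(10000))
mv = memoryview(values)

plain = zlib.compress(mv.cast("B"))
shuffled = zlib.compress(shuffle_bytes(mv))

# Compare sizes; the shuffle is only possible because itemsize is known.
print(mv.nbytes, len(plain), len(shuffled))
```

For data like this, shuffling tends to place similar bytes next to each other, which often helps a general-purpose compressor. The transform is unavailable once the view has been reduced to plain bytes with itemsize 1.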

Issue Analytics

Created 6 years ago
Comments: 19 (19 by maintainers)

Top GitHub Comments

mrocklin commented, Jun 14, 2017

This was helpful. Thank you. It looks like this problem only arises when we are moving enough data around.

On Wed, Jun 14, 2017 at 3:42 AM, Simon Perkins notifications@github.com wrote:

Note that there are some arrays with zero dimensions in them, but commenting them out did not prevent the problem occurring.

mrocklin commented, Jun 16, 2017

We now pass through dtype information, but we don’t yet pass through strides. I’ll wait on strides until we have a concrete application that needs N-d compression. Closing.