Leverage the new PEP 574 for no-copy pickling of contiguous arrays
PEP 574 (scheduled for Python 3.8) introduces pickle protocol 5 with support for no-copy pickling of large mutable buffers.
I made a small proof-of-concept benchmark script using @pitrou’s pickle5 backport of his draft implementation of PEP 574.
See: https://gist.github.com/ogrisel/a2b0e5ae4987a398caa7f9277cb3b90a
The meat lies in the following reducer:
    import numpy as np
    from pickle5 import PickleBuffer

    def _array_from_buffer(buffer, dtype, shape):
        # Rebuild the array as a view on the unpickled buffer (no copy).
        return np.frombuffer(buffer, dtype=dtype).reshape(shape)

    def reduce_ndarray_pickle5(a):
        # This reducer assumes protocol 5, as there is currently no way to
        # register a protocol-aware reduce function in the global copyreg
        # dispatch table.
        if not a.dtype.hasobject and a.flags.c_contiguous:
            # No-copy pickling for C-contiguous arrays and protocol 5
            return _array_from_buffer, (PickleBuffer(a), a.dtype, a.shape), None
        else:
            # Fall back to the generic method
            return a.__reduce__()
This works as expected (no extra copy when dumping and loading) and also fixes the in-memory speed overhead reported by @mrocklin in #7544.
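For readers who want to try the reducer without touching the global copyreg table, here is a minimal sketch. It assumes Python 3.8+ (where `PickleBuffer` lives in the standard `pickle` module instead of the pickle5 backport) and registers the reducer on a per-pickler `dispatch_table`. The helper name `dumps_with_reducer` is hypothetical and not part of the gist.

```python
import copyreg
import io
import pickle
from pickle import PickleBuffer  # assumption: Python 3.8+, not the pickle5 backport

import numpy as np


def _array_from_buffer(buffer, dtype, shape):
    # Rebuild the array as a view on the buffer handed back by the unpickler.
    return np.frombuffer(buffer, dtype=dtype).reshape(shape)


def reduce_ndarray_pickle5(a):
    # Same logic as the reducer from the issue.
    if not a.dtype.hasobject and a.flags.c_contiguous:
        return _array_from_buffer, (PickleBuffer(a), a.dtype, a.shape), None
    return a.__reduce__()


def dumps_with_reducer(obj):
    """Hypothetical helper: pickle with protocol 5 and out-of-band buffers."""
    buffers = []
    stream = io.BytesIO()
    pickler = pickle.Pickler(stream, protocol=5, buffer_callback=buffers.append)
    # A per-pickler dispatch table leaves the global copyreg table untouched.
    pickler.dispatch_table = copyreg.dispatch_table.copy()
    pickler.dispatch_table[np.ndarray] = reduce_ndarray_pickle5
    pickler.dump(obj)
    return stream.getvalue(), buffers


a = np.arange(12, dtype=np.int64).reshape(3, 4)
payload, buffers = dumps_with_reducer(a)
b = pickle.loads(payload, buffers=buffers)
```

Because the buffers travel out-of-band, `b` is rebuilt directly on top of the memory of `a`; a real transport layer would ship `buffers` alongside `payload`.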
To get this into numpy, we would need a protocol-aware reduce function. That is, ndarray should implement a __reduce_ex__ method that accepts a protocol argument, instead of relying only on the existing bytes-based implementation from array_reduce in https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/methods.c#L1577. This bytes-based implementation should probably be kept as a fallback when protocol < 5.
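The dispatch described here can be sketched in pure Python on an ndarray subclass. This is only an illustration of the idea, not the C implementation that numpy would need: the class `ProtocolAwareArray` and the helper `_rebuild_from_buffer` are hypothetical names, and the standard-library `pickle` of Python 3.8+ is assumed.

```python
import pickle
from pickle import PickleBuffer  # assumption: Python 3.8+

import numpy as np


def _rebuild_from_buffer(cls, buffer, dtype, shape):
    # Hypothetical helper: rebuild the array on top of the buffer, then
    # restore the subclass type with a view (no data copy).
    return np.frombuffer(buffer, dtype=dtype).reshape(shape).view(cls)


class ProtocolAwareArray(np.ndarray):
    """Hypothetical subclass illustrating a protocol-aware reduce."""

    def __reduce_ex__(self, protocol):
        if (protocol >= 5 and not self.dtype.hasobject
                and self.flags.c_contiguous):
            # Protocol 5: expose the data through a PickleBuffer.
            args = (type(self), PickleBuffer(self), self.dtype, self.shape)
            return _rebuild_from_buffer, args
        # Older protocols, or unsupported layouts: bytes-based fallback.
        return super().__reduce_ex__(protocol)


a = np.arange(6, dtype=np.float32).reshape(2, 3).view(ProtocolAwareArray)
# Round-trip with an old protocol (fallback path) and with protocol 5.
results = {p: pickle.loads(pickle.dumps(a, protocol=p)) for p in (4, 5)}
```

Both protocols round-trip to an equal array of the same subclass; only the protocol 5 path goes through `PickleBuffer`.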
Closing: #12011 was merged in numpy master.
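With that change in place, no custom reducer should be needed. A minimal sketch, assuming Python 3.8+ and a NumPy release that includes #12011:

```python
import pickle

import numpy as np

a = np.arange(1_000_000, dtype=np.float64)

# Collect the large data buffers out-of-band instead of copying them
# into the pickle stream.
buffers = []
payload = pickle.dumps(a, protocol=5, buffer_callback=buffers.append)

# The same buffers must be supplied again when loading.
b = pickle.loads(payload, buffers=buffers)

# The in-band payload only carries metadata (dtype, shape, ...).
payload_is_small = len(payload) < a.nbytes
```

Here the payload stays small because the array data lives in `buffers`, and `b` is a view on the same memory as `a` rather than a copy.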
@pierreglaser did the work. He uses airspeed velocity to do the measurements. I am not sure whether the baseline is subtracted or not. Probably not.