`.cache()` fails for MaskedArrays
joblib's caching implementation for ndarray arguments depends on viewing the object as uint8 (`.view(uint8)`). This is not possible for masked arrays, because the size of the array changes and the mask no longer fits. See also https://github.com/numpy/numpy/issues/10074.
$ cat mwe.py
#!/usr/bin/env python3.6
import numpy
import joblib
memory = joblib.Memory(cachedir="/tmp", verbose=1)
@memory.cache
def foo(x):
    return x+x
x = numpy.ma.MaskedArray(
    numpy.zeros(10, "uint16"),
    mask=numpy.zeros(10, "?"))
print(foo(x))
$ ./mwe.py
Traceback (most recent call last):
File "./mwe.py", line 16, in <module>
print(foo(x))
File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/memory.py", line 562, in __call__
return self._cached_call(args, kwargs)[0]
File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/memory.py", line 499, in _cached_call
output_dir, argument_hash = self._get_output_dir(*args, **kwargs)
File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/memory.py", line 585, in _get_output_dir
argument_hash = self._get_argument_hash(*args, **kwargs)
File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/memory.py", line 579, in _get_argument_hash
coerce_mmap=(self.mmap_mode is not None))
File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/hashing.py", line 263, in hash
return hasher.hash(obj)
File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/hashing.py", line 69, in hash
self.dump(obj)
File "/home/users/gholl/lib/python3.6/pickle.py", line 409, in dump
self.save(obj)
File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/hashing.py", line 243, in save
Hasher.save(self, obj)
File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/hashing.py", line 95, in save
Pickler.save(self, obj)
File "/home/users/gholl/lib/python3.6/pickle.py", line 476, in save
f(self, obj) # Call unbound method with explicit self
File "/home/users/gholl/lib/python3.6/pickle.py", line 821, in save_dict
self._batch_setitems(obj.items())
File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/hashing.py", line 147, in _batch_setitems
Pickler._batch_setitems(self, iter(sorted(items)))
File "/home/users/gholl/lib/python3.6/pickle.py", line 852, in _batch_setitems
save(v)
File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/hashing.py", line 212, in save
self._getbuffer(obj_c_contiguous.view(self.np.uint8)))
File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/numpy/ma/core.py", line 3146, in view
output = ndarray.view(self, dtype)
File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/numpy/ma/core.py", line 3425, in __setattr__
self._mask.shape = self.shape
ValueError: cannot reshape array of size 10 into shape (20,)
Using Python 3.6, numpy 1.13.3, and joblib 0.11.

I just hit this bug. Any update on this?
I can reproduce the bug. I guess a workaround is to have the memoized function take two arrays as arguments: one for the data without the mask, and one for the mask.
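A minimal sketch of that workaround, in case it helps. It assumes a newer joblib where `Memory` takes `location` (joblib 0.11, as in the report, takes `cachedir` instead). The names `_foo_cached` and `foo` and the temporary cache directory are only illustrative.

```python
import tempfile

import joblib
import numpy

# Assumption: newer joblib API. On joblib 0.11 use cachedir=... instead.
memory = joblib.Memory(location=tempfile.mkdtemp(), verbose=0)


@memory.cache
def _foo_cached(data, mask):
    # Both arguments are plain ndarrays, so joblib can hash them.
    x = numpy.ma.MaskedArray(data, mask=mask)
    result = x + x
    # Hand back plain arrays as well; the wrapper rebuilds the masked array.
    return numpy.ma.getdata(result), numpy.ma.getmaskarray(result)


def foo(x):
    # Split the masked array before it reaches the cached function.
    data, mask = _foo_cached(numpy.ma.getdata(x), numpy.ma.getmaskarray(x))
    return numpy.ma.MaskedArray(data, mask=mask)


x = numpy.ma.MaskedArray(
    numpy.arange(4, dtype="uint16"),
    mask=[False, True, False, False])
print(foo(x))
```

The caller still passes and receives a masked array; only the cached inner function sees the data and the mask as separate arguments.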
A PR with a non-regression test would be more than welcome. Maybe a quick fix is to exclude masked arrays from this `if` clause: https://github.com/joblib/joblib/blob/cbabe65d9b1b52a7be14885f258052d1f26ddb6f/joblib/hashing.py#L190
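The exclusion suggested above could look roughly like the check below. This is a standalone sketch, not joblib's actual code: `is_plain_ndarray` is a hypothetical helper, and what the hasher should do with masked arrays once they are excluded (presumably fall back to regular pickling) is an assumption.

```python
import numpy


def is_plain_ndarray(obj):
    """Hypothetical check: True for ndarrays that are not masked arrays."""
    # MaskedArray subclasses ndarray, so it has to be ruled out explicitly.
    return (
        isinstance(obj, numpy.ndarray)
        and not isinstance(obj, numpy.ma.MaskedArray)
    )


plain = numpy.zeros(10, "uint16")
masked = numpy.ma.MaskedArray(plain, mask=numpy.zeros(10, "?"))
print(is_plain_ndarray(plain), is_plain_ndarray(masked))
```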