`.cache()` fails for MaskedArrays

joblib's caching implementation hashes ndarray arguments by viewing the object as uint8 (`.view(numpy.uint8)`). This is not possible for masked arrays: the view changes the size of the data array (here, 10 uint16 elements become 20 uint8 elements), so the mask no longer fits. See also https://github.com/numpy/numpy/issues/10074.

$ cat mwe.py 
#!/usr/bin/env python3.6

import numpy
import joblib

memory = joblib.Memory(cachedir="/tmp", verbose=1)

@memory.cache
def foo(x):
    return x+x

x = numpy.ma.MaskedArray(
        numpy.zeros(10, "uint16"),
        mask=numpy.zeros(10, "?"))

print(foo(x))
$ ./mwe.py 
Traceback (most recent call last):
  File "./mwe.py", line 16, in <module>
    print(foo(x))
  File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/memory.py", line 562, in __call__
    return self._cached_call(args, kwargs)[0]
  File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/memory.py", line 499, in _cached_call
    output_dir, argument_hash = self._get_output_dir(*args, **kwargs)
  File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/memory.py", line 585, in _get_output_dir
    argument_hash = self._get_argument_hash(*args, **kwargs)
  File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/memory.py", line 579, in _get_argument_hash
    coerce_mmap=(self.mmap_mode is not None))
  File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/hashing.py", line 263, in hash
    return hasher.hash(obj)
  File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/hashing.py", line 69, in hash
    self.dump(obj)
  File "/home/users/gholl/lib/python3.6/pickle.py", line 409, in dump
    self.save(obj)
  File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/hashing.py", line 243, in save
    Hasher.save(self, obj)
  File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/hashing.py", line 95, in save
    Pickler.save(self, obj)
  File "/home/users/gholl/lib/python3.6/pickle.py", line 476, in save
    f(self, obj) # Call unbound method with explicit self
  File "/home/users/gholl/lib/python3.6/pickle.py", line 821, in save_dict
    self._batch_setitems(obj.items())
  File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/hashing.py", line 147, in _batch_setitems
    Pickler._batch_setitems(self, iter(sorted(items)))
  File "/home/users/gholl/lib/python3.6/pickle.py", line 852, in _batch_setitems
    save(v)
  File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/joblib/hashing.py", line 212, in save
    self._getbuffer(obj_c_contiguous.view(self.np.uint8)))
  File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/numpy/ma/core.py", line 3146, in view
    output = ndarray.view(self, dtype)
  File "/dev/shm/gerrit/venv/stable-3.6/lib/python3.6/site-packages/numpy/ma/core.py", line 3425, in __setattr__
    self._mask.shape = self.shape
ValueError: cannot reshape array of size 10 into shape (20,)

Using Python 3.6, numpy 1.13.3, and joblib 0.11.
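
The size mismatch reported in the ValueError can be illustrated with plain ndarrays, without going through joblib or `numpy.ma` at all. This is only a sketch of the arithmetic behind the error, not a reproduction of joblib's hashing code; the variable names are chosen for illustration.

```python
import numpy as np

# Same shapes and dtypes as in the example above.
data = np.zeros(10, "uint16")
mask = np.zeros(10, "?")

# Reinterpreting 2-byte elements as single bytes doubles the element count.
as_bytes = data.view(np.uint8)

print(data.shape)      # shape of the original data
print(as_bytes.shape)  # shape after viewing as uint8
print(mask.shape)      # the mask still has one entry per original element
```

The uint8 view has 20 elements while the mask still has 10, which matches the "cannot reshape array of size 10 into shape (20,)" message.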

Issue Analytics

  • State: open
  • Created 6 years ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

1 reaction
Jerry-Ma commented, Jan 17, 2020

I just hit this bug. Any update on this?

0 reactions
lesteve commented, Nov 27, 2017

I can reproduce the bug. I guess a workaround is to have the memoized function take two arrays as arguments: one for the array without the mask, and one for the mask.

A PR with a non-regression test would be more than welcome. A quick fix might be to exclude masked arrays from the if clause here:

https://github.com/joblib/joblib/blob/cbabe65d9b1b52a7be14885f258052d1f26ddb6f/joblib/hashing.py#L190
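
The two-array workaround mentioned in this comment could look roughly like the sketch below. It is an illustration under assumptions, not code from the joblib project: the helper name `_foo_unmasked` and the temporary cache directory are hypothetical, and the cache location is passed positionally because the keyword name may differ between joblib versions.

```python
import tempfile

import numpy as np
import joblib

# Hypothetical cache location; any writable directory would do.
memory = joblib.Memory(tempfile.mkdtemp(), verbose=0)


@memory.cache
def _foo_unmasked(data, mask):
    """Cached worker that only ever sees plain ndarrays."""
    x = np.ma.MaskedArray(data, mask=mask)
    result = x + x
    # Return plain arrays so the cached value is also mask-free.
    return np.ma.getdata(result), np.ma.getmaskarray(result)


def foo(x):
    """Uncached wrapper: split the masked array, then rebuild the result."""
    data, mask = _foo_unmasked(np.ma.getdata(x), np.ma.getmaskarray(x))
    return np.ma.MaskedArray(data, mask=mask)


x = np.ma.MaskedArray(
    np.arange(10, dtype="uint16"),
    mask=np.arange(10) % 2 == 0)

print(foo(x))
print(foo(x))  # second call is served from the cache
```

Only the inner function is decorated, so joblib hashes two ordinary ndarrays and never calls `.view` on a masked array. The cost is that the wrapper has to split and rebuild the masked array on every call.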
