
Parallel crashes when pickling large non-array objects


Parallel crashes when pickling large non-array objects under Joblib 0.10 in Python 2.7.11. What is the reason for this problem? Are there any known workarounds?

Minimal reproduction script:

from joblib import Parallel, delayed
x = 30
def fun(i): return ['a']*(2**x)
output = Parallel(n_jobs=2)(delayed(fun)(i) for i in range(2))

For x >= 30, one of the worker processes fails after around an hour with this traceback:

Process PoolWorker-2:
Traceback (most recent call last):
  File "/NS/anaconda/anaconda/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/NS/anaconda/anaconda/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/NS/anaconda/anaconda/lib/python2.7/multiprocessing/pool.py", line 122, in worker
    put((job, i, (False, wrapped)))
  File "/NS/anaconda/anaconda/lib/python2.7/site-packages/joblib/pool.py", line 386, in put
    return send(obj)
  File "/NS/anaconda/anaconda/lib/python2.7/site-packages/joblib/pool.py", line 371, in send
    CustomizablePickler(buffer, self._reducers).dump(obj)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 554, in save_tuple
    save(element)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 554, in save_tuple
    save(element)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 401, in save_reduce
    save(args)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 554, in save_tuple
    save(element)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 492, in save_string
    self.write(BINSTRING + pack("<i", n) + obj)
error: 'i' format requires -2147483648 <= number <= 2147483647
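The last frame of the traceback shows the root cause: protocol-2 pickle frames each string payload with `pack("<i", n)`, a signed 32-bit length, so any single pickled string of 2 GiB or more overflows the format. A few lines of standard-library code reproduce the overflow directly:

```python
import struct

# Protocol-2 pickles frame each string with pack("<i", n): a signed
# 32-bit length. A single pickled string of 2**31 bytes (2 GiB) or
# more therefore cannot be framed at all.
limit = 2**31 - 1
assert len(struct.pack("<i", limit)) == 4  # largest length that still fits

try:
    struct.pack("<i", limit + 1)           # one byte past the limit
except struct.error as exc:
    print(exc)                             # same range error as the traceback
```

This is why the crash depends only on the pickled size of a single returned object, not on total memory use.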

Details of the machine:

$ uname -a
Linux brain02 3.18.28.1.amd64-smp #1 SMP Mon Mar 7 10:59:19 CET 2016 x86_64 GNU/Linux
$ python
Python 2.7.11 |Anaconda 2.3.0 (64-bit)| (default, Dec  6 2015, 18:08:32)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2

Issue Analytics

  • State: open
  • Created: 7 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
nickeubank commented, Apr 3, 2017

Got a request to put my outputs here.

@lesteve As noted in #514, I think this now IS something joblib should be able to get around given changes in Python in 3.5 and 3.6 that allow for control over pickling protocols.
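The protocol control referenced here can be checked directly: pickle protocol 4 (available since Python 3.4) records its version in the stream header and adds 64-bit length opcodes such as BINBYTES8, which lift the 32-bit framing limit. A minimal sketch, with a small payload standing in for a multi-gigabyte one:

```python
import pickle

data = b"x" * 100  # stand-in for a multi-gigabyte payload

# Protocol 2 length-prefixes byte strings with 32-bit ints; protocol 4
# adds 64-bit variants (e.g. BINBYTES8), lifting the 2 GiB framing limit.
p2 = pickle.dumps(data, protocol=2)
p4 = pickle.dumps(data, protocol=4)

# The second byte of a pickle stream records the protocol version.
print(p2[1], p4[1])

assert pickle.loads(p4) == data  # round-trips regardless of protocol
```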

Minimal replicating example (warning: my machine has 48 GB RAM; this may be a little too big for laptops. I haven’t worked hard at finding the minimum Series size that generates the error):

import pandas as pd
def prob_func(i):
    return pd.Series(1).repeat(2**28)

from joblib import Parallel, delayed
a = Parallel(n_jobs=2, verbose=40)(delayed(prob_func)(i) for i in range(2))

The same error occurs even with the memmapping threshold raised via max_nbytes:

from joblib import Parallel, delayed
a = Parallel(n_jobs=2, verbose=40, max_nbytes="1.8G")(delayed(prob_func)(i) for i in range(2))

Output:

Traceback (most recent call last):

  File "<ipython-input-1-870c0185c434>", line 15, in <module>
    a = Parallel(n_jobs=2, verbose=40, max_nbytes="1.8G")(delayed(prob_func)(i) for i in range(2))

  File "/Users/Nick/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 789, in __call__
    self.retrieve()

  File "/Users/Nick/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 699, in retrieve
    self._output.extend(job.get(timeout=self.timeout))

  File "/Users/Nick/anaconda/lib/python3.6/multiprocessing/pool.py", line 608, in get
    raise self._value

MaybeEncodingError: Error sending result: '[0    1
0    1

[long list of 0    1  omitted...]

0    1
0    1
dtype: int64]'. Reason: 'error("'i' format requires -2147483648 <= number <= 2147483647",)'
0 reactions
ogrisel commented, Sep 6, 2016

Then you can store the results in a persistent store (e.g. files in a folder or records in a DB) and return the identifiers of the files or records instead of large string objects.
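A minimal serial sketch of that pattern, using only the standard library (names like run_and_persist and outdir are illustrative; in the real setting each call would be dispatched via Parallel(n_jobs=2)(delayed(run_and_persist)(i) for i in range(2))):

```python
import os
import pickle
import tempfile

outdir = tempfile.mkdtemp()  # persistent store: a folder of pickle files

def run_and_persist(i):
    result = ["a"] * 1000                         # stand-in for a huge object
    path = os.path.join(outdir, "result_%d.pkl" % i)
    with open(path, "wb") as f:
        pickle.dump(result, f)                    # the big object goes to disk
    return path                                   # only this short string is
                                                  # sent back between processes

paths = [run_and_persist(i) for i in range(2)]    # serially here, for brevity
results = [pickle.load(open(p, "rb")) for p in paths]
```

Because each worker returns only a path, nothing larger than a few bytes ever crosses the inter-process pipe, sidestepping the 2 GiB pickle framing limit entirely.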

Alternatively, reducing the chunk size so that each job result is smaller than 2 GB should also solve your issue.
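A sketch of that chunking workaround, with sizes scaled down: instead of one job returning a single oversized list, dispatch more jobs that each return a slice, so no individual pickled result approaches 2 GiB. (fun_chunk and the sizes are illustrative; in practice each call would run via Parallel(...)(delayed(fun_chunk)(i) ...), shown serially here.)

```python
# Split one oversized result into n_chunks pieces so each pickled
# payload sent back from a worker stays small.
total, n_chunks = 2**12, 4        # stand-ins for 2**30 elements and a real split
per_chunk = total // n_chunks

def fun_chunk(i):
    return ["a"] * per_chunk      # one slice of the original result

pieces = [fun_chunk(i) for i in range(n_chunks)]
merged = [x for piece in pieces for x in piece]  # reassemble in the parent
```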

Closing as this cannot be fixed in joblib itself.


Top Results From Across the Web

  • How to perform a pickling so that it is robust against crashing?
    I routinely use pickle.dump() to save large files in Python 2.7. In my code, I have one .pickle file that I continually update...
  • 12.1. pickle — Python object serialization — Python 3.5.2 documentation
    The pickle module implements binary protocols for serializing and de-serializing a Python object structure. “Pickling” is the process whereby a Python object...
  • Multiprocessing and Pickle, How to Easily fix that?
    Pickling or Serialization transforms an object state into a series of bits — the object could be methods, data, a class, API end-points, etc...
  • Release Notes — NumPy v1.16 Manual
    The most noticeable change in this release is that unpickling object arrays... With pickle protocol 5, and the PickleBuffer API, a large...
  • Release Notes — Numba 0.50.1 documentation
    Large scale removal of unsupported Python and NumPy versions has taken place along with a... PR #3449: Allow matching non-array objects in...
