
Parallel crashes when pickling large non-array objects


Parallel crashes when pickling large non-array objects under Joblib 0.10 in Python 2.7.11. What is the reason for this problem? Are there any known workarounds?

Minimal reproduction script:

from joblib import Parallel, delayed
x = 30
def fun(i): return ['a']*(2**x)
output = Parallel(n_jobs=2)(delayed(fun)(i) for i in range(2))

For x >= 30, one of the worker processes fails after around an hour with this traceback:

Process PoolWorker-2:
Traceback (most recent call last):
  File "/NS/anaconda/anaconda/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/NS/anaconda/anaconda/lib/python2.7/multiprocessing/process.py", line 114, in run
    self._target(*self._args, **self._kwargs)
  File "/NS/anaconda/anaconda/lib/python2.7/multiprocessing/pool.py", line 122, in worker
    put((job, i, (False, wrapped)))
  File "/NS/anaconda/anaconda/lib/python2.7/site-packages/joblib/pool.py", line 386, in put
    return send(obj)
  File "/NS/anaconda/anaconda/lib/python2.7/site-packages/joblib/pool.py", line 371, in send
    CustomizablePickler(buffer, self._reducers).dump(obj)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 224, in dump
    self.save(obj)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 554, in save_tuple
    save(element)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 554, in save_tuple
    save(element)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 331, in save
    self.save_reduce(obj=obj, *rv)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 401, in save_reduce
    save(args)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 554, in save_tuple
    save(element)
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 286, in save
    f(self, obj) # Call unbound method with explicit self
  File "/NS/anaconda/anaconda/lib/python2.7/pickle.py", line 492, in save_string
    self.write(BINSTRING + pack("<i", n) + obj)
error: 'i' format requires -2147483648 <= number <= 2147483647
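The last frame of the traceback shows the root cause: protocol-2 pickle frames each string payload with `pack("<i", n)`, a signed 32-bit length, so any single pickled string of 2 GiB or more overflows the format. A few lines of standard-library code reproduce the overflow directly:

```python
import struct

# Protocol-2 pickles frame each string with pack("<i", n): a signed
# 32-bit length. A single pickled string of 2**31 bytes (2 GiB) or
# more therefore cannot be framed at all.
limit = 2**31 - 1
assert len(struct.pack("<i", limit)) == 4  # largest length that still fits

try:
    struct.pack("<i", limit + 1)           # one byte past the limit
except struct.error as exc:
    print(exc)                             # same range error as the traceback
```

This is why the crash depends only on the pickled size of a single returned object, not on total memory use.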

Details of the machine:

$ uname -a
Linux brain02 3.18.28.1.amd64-smp #1 SMP Mon Mar 7 10:59:19 CET 2016 x86_64 GNU/Linux
$ python
Python 2.7.11 |Anaconda 2.3.0 (64-bit)| (default, Dec  6 2015, 18:08:32)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2

Issue Analytics

  • State: open
  • Created: 7 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
nickeubank commented, Apr 3, 2017

Got a request to put my outputs here.

@lesteve As noted in #514, I think this now IS something joblib should be able to get around given changes in Python in 3.5 and 3.6 that allow for control over pickling protocols.
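The protocol control referenced here can be checked directly: pickle protocol 4 (available since Python 3.4) records its version in the stream header and adds 64-bit length opcodes such as BINBYTES8, which lift the 32-bit framing limit. A minimal sketch, with a small payload standing in for a multi-gigabyte one:

```python
import pickle

data = b"x" * 100  # stand-in for a multi-gigabyte payload

# Protocol 2 length-prefixes byte strings with 32-bit ints; protocol 4
# adds 64-bit variants (e.g. BINBYTES8), lifting the 2 GiB framing limit.
p2 = pickle.dumps(data, protocol=2)
p4 = pickle.dumps(data, protocol=4)

# The second byte of a pickle stream records the protocol version.
print(p2[1], p4[1])

assert pickle.loads(p4) == data  # round-trips regardless of protocol
```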

Minimal replicating example (warning: my machine has 48 GB RAM; this may be a little too big for laptops. I haven’t worked hard at finding the minimum Series size that generates the error):

import pandas as pd
def prob_func(i):
    return pd.Series(1).repeat(2**28)

from joblib import Parallel, delayed
a = Parallel(n_jobs=2, verbose=40)(delayed(prob_func)(i) for i in range(2))

The same error occurs even with the memmapping threshold raised via max_nbytes:

from joblib import Parallel, delayed
a = Parallel(n_jobs=2, verbose=40, max_nbytes="1.8G")(delayed(prob_func)(i) for i in range(2))

Output:

Traceback (most recent call last):

  File "<ipython-input-1-870c0185c434>", line 15, in <module>
    a = Parallel(n_jobs=2, verbose=40, max_nbytes="1.8G")(delayed(prob_func)(i) for i in range(2))

  File "/Users/Nick/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 789, in __call__
    self.retrieve()

  File "/Users/Nick/anaconda/lib/python3.6/site-packages/joblib/parallel.py", line 699, in retrieve
    self._output.extend(job.get(timeout=self.timeout))

  File "/Users/Nick/anaconda/lib/python3.6/multiprocessing/pool.py", line 608, in get
    raise self._value

MaybeEncodingError: Error sending result: '[0    1
0    1

[long list of 0    1  omitted...]

0    1
0    1
dtype: int64]'. Reason: 'error("'i' format requires -2147483648 <= number <= 2147483647",)'
0 reactions
ogrisel commented, Sep 6, 2016

Then you can store the results in a persistent store (e.g. files in a folder or records in a DB) and return the identifiers of the files or records instead of large string objects.
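A minimal serial sketch of that pattern, using only the standard library (names like run_and_persist and outdir are illustrative; in the real setting each call would be dispatched via Parallel(n_jobs=2)(delayed(run_and_persist)(i) for i in range(2))):

```python
import os
import pickle
import tempfile

outdir = tempfile.mkdtemp()  # persistent store: a folder of pickle files

def run_and_persist(i):
    result = ["a"] * 1000                         # stand-in for a huge object
    path = os.path.join(outdir, "result_%d.pkl" % i)
    with open(path, "wb") as f:
        pickle.dump(result, f)                    # the big object goes to disk
    return path                                   # only this short string is
                                                  # sent back between processes

paths = [run_and_persist(i) for i in range(2)]    # serially here, for brevity
results = [pickle.load(open(p, "rb")) for p in paths]
```

Because each worker returns only a path, nothing larger than a few bytes ever crosses the inter-process pipe, sidestepping the 2 GiB pickle framing limit entirely.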

Alternatively, reducing the chunk size so that each job result is smaller than 2 GB should also solve your issue.
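A sketch of that chunking workaround, with sizes scaled down: instead of one job returning a single oversized list, dispatch more jobs that each return a slice, so no individual pickled result approaches 2 GiB. (fun_chunk and the sizes are illustrative; in practice each call would run via Parallel(...)(delayed(fun_chunk)(i) ...), shown serially here.)

```python
# Split one oversized result into n_chunks pieces so each pickled
# payload sent back from a worker stays small.
total, n_chunks = 2**12, 4        # stand-ins for 2**30 elements and a real split
per_chunk = total // n_chunks

def fun_chunk(i):
    return ["a"] * per_chunk      # one slice of the original result

pieces = [fun_chunk(i) for i in range(n_chunks)]
merged = [x for piece in pieces for x in piece]  # reassemble in the parent
```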

Closing as this cannot be fixed in joblib itself.


Top Results From Across the Web

  • How to perform a pickling so that it is robust against crashing?
    I routinely use pickle.dump() to save large files in Python 2.7. In my code, I have one .pickle file that I continually update...
  • 12.1. pickle — Python object serialization — Python 3.5.2 documentation
    The pickle module implements binary protocols for serializing and de-serializing a Python object structure. “Pickling” is the process whereby a Python object...
  • Multiprocessing and Pickle, How to Easily fix that?
    Pickling or Serialization transforms an object state into a series of bits — the object could be methods, data, a class, API end-points, etc...
  • Release Notes — NumPy v1.16 Manual
    The most noticeable change in this release is that unpickling object arrays... With pickle protocol 5, and the PickleBuffer API, a large...
  • Release Notes — Numba 0.50.1 documentation
    Large scale removal of unsupported Python and NumPy versions has taken place along with a... PR #3449: Allow matching non-array objects in...
