struct.error: 'i' format requires -2147483648 <= number <= 2147483647

Hello, I am using joblib to parallelize the computation of a feature matrix, a large numpy array of floats (~7k rows and ~10k columns, ~70M values).

My code breaks at this point:

user_item_features = Parallel(n_jobs=n_jobs)(
    delayed(self._compute_features)(data, recommender, users_list)
    for users_list in users_list_chunks
)

with this error:

Traceback (most recent call last):
  File "entity2rec/node2vec_recommender.py", line 138, in <module>
    n_jobs=args.workers, supervised=False)
  File "/home/semantic/Repositories/entity2rec/entity2rec/evaluator.py", line 255, in features
    users_list_chunks, n_jobs)
  File "/home/semantic/Repositories/entity2rec/entity2rec/evaluator.py", line 269, in _compute_features_parallel
    for users_list in users_list_chunks)
  File "/home/semantic/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 789, in __call__
    self.retrieve()
  File "/home/semantic/anaconda3/lib/python3.6/site-packages/joblib/parallel.py", line 699, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/home/semantic/anaconda3/lib/python3.6/multiprocessing/pool.py", line 608, in get
    raise self._value
  File "/home/semantic/anaconda3/lib/python3.6/multiprocessing/pool.py", line 385, in _handle_tasks
    put(task)
  File "/home/semantic/anaconda3/lib/python3.6/site-packages/joblib/pool.py", line 372, in send
    self._writer.send_bytes(buffer.getvalue())
  File "/home/semantic/anaconda3/lib/python3.6/multiprocessing/connection.py", line 200, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/semantic/anaconda3/lib/python3.6/multiprocessing/connection.py", line 393, in _send_bytes
    header = struct.pack("!i", n)
struct.error: 'i' format requires -2147483648 <= number <= 2147483647

I have obtained this error using Linux and different versions of Python:

  • python 3.6.6, 3.6.3, 3.6.0
  • joblib 0.11

Any help would be appreciated. Thank you for your work. Enrico
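
The last frame of the traceback packs the message length with struct.pack("!i", n), a signed 32-bit integer. On these Python versions, a single pickled payload larger than 2147483647 bytes (just under 2 GiB) therefore cannot be sent to a worker. A minimal sketch of that limit follows; the helpers fits_in_header and pickled_size are hypothetical names introduced here for illustration and are not part of joblib or multiprocessing.

```python
import pickle
import struct

INT32_MAX = 2**31 - 1  # 2147483647, the upper bound quoted in the error


def fits_in_header(n_bytes):
    """Return True if a payload length can be packed with the '!i' format."""
    try:
        struct.pack("!i", n_bytes)
    except struct.error:
        return False
    return True


def pickled_size(obj):
    """Size in bytes of obj once pickled (builds the full pickle in memory)."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


print(fits_in_header(INT32_MAX))      # largest length that still packs
print(fits_in_header(INT32_MAX + 1))  # one byte more overflows the header
print(fits_in_header(pickled_size({"user": [0.1, 0.2]})))
```

Calling pickled_size on each argument passed to the delayed function is one way to find which object crosses the limit, at the cost of building the whole pickle in memory.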

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

4 reactions
ogrisel commented, Aug 2, 2018

Alright, it’s caused by a limitation in a low-level multiprocessing routine. If your dictionary does not change often, it’s better to serialize it to disk (using pickle.dump / pickle.load from the standard library, which is faster than joblib for this kind of object) and then load it in your workers in your own code instead of passing it as an argument to the parallel function.

We cannot do anything at the joblib level in this case, so closing.
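
The workaround described above could be sketched roughly as follows. This is only an illustration under stated assumptions: dump_shared, compute_chunk and run_parallel are hypothetical names, the shared object is assumed to be a dictionary keyed by user, and the list comprehension is a placeholder for the real feature computation.

```python
import os
import pickle
import tempfile


def dump_shared(obj, path):
    """Write the large, rarely changing object to disk once, in the parent."""
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def compute_chunk(path, users_chunk):
    """Hypothetical worker: load the shared object from disk, then use it."""
    with open(path, "rb") as f:
        shared = pickle.load(f)
    # Placeholder computation standing in for the real feature extraction.
    return [shared.get(user, 0.0) for user in users_chunk]


def run_parallel(path, users_list_chunks, n_jobs=2):
    """Send only the short path string to each task, not the object itself."""
    # Imported lazily so the helpers above work without joblib installed.
    from joblib import Parallel, delayed

    return Parallel(n_jobs=n_jobs)(
        delayed(compute_chunk)(path, chunk) for chunk in users_list_chunks
    )
```

With this layout, the parent calls dump_shared once and then run_parallel(path, users_list_chunks), so each task message carries a file path and a chunk of users rather than the multi-gigabyte object. Every task reloads the file in this sketch, so the approach pays off mainly when chunks are large relative to the load time.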

0 reactions
phesami-zest commented, Oct 28, 2020

Hi @ogrisel, I’m facing a similar issue and can’t upgrade to Python 3.8 due to other compatibility issues. Could you elaborate a little more on the workaround you mentioned here? Do you mean using the multiprocessing backend, which uses the native pickle library to serialize data? Reference: https://joblib.readthedocs.io/en/latest/auto_examples/serialization_and_wrappers.html
