Multiprocessing error with s3_iter_bucket on macOS

OS: macOS 10.14.1 (Mojave)
Python: 3.7.2 (brew)

I’ve been using s3_iter_bucket to traverse an S3 bucket, but no matter how many workers I use (I tried the default 16, then 8, then 1), Python crashes with a multiprocessing error.

I’m not sure whether this is an OS or a smart_open issue, but I do wonder whether anyone else has experienced it.

This is the relevant bit when I’m calling smart_open:

# ...
# iterate only through one dir at a time
for key, content in s3_iter_bucket(bucket, prefix=bucket_prefix, workers=1):
    click.secho(">>>>> File: " + key + str(len(content)), fg="green")
    parse_and_index_data(content, index_name, host_name, key)
# ...

This is the usual error after a few thousand items have been processed. More precisely, it is what I see after I hit Ctrl-C, because Python crashes with a system dialog and everything hangs:

Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 110, in worker
    task = get()
  File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/queues.py", line 352, in get
    res = self._reader.recv_bytes()
  File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
KeyboardInterrupt

Any hints? One workaround would be to be able to set _MULTIPROCESSING = False when calling s3_iter_bucket, but that is not possible at the moment…
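For illustration, here is a minimal sketch of what a fallback without multiprocessing could look like. It is not smart_open code: `iter_contents` and `download` are hypothetical names, and the sketch assumes the per-key download can be wrapped in a plain function. Threads stand in for worker processes, so nothing is forked.

```python
from concurrent.futures import ThreadPoolExecutor


def iter_contents(keys, download, workers=4):
    """Yield (key, content) pairs without starting any worker processes.

    `download` is a hypothetical callable that takes a key and returns
    the object's content; it is an assumption, not a smart_open API.
    """
    keys = list(keys)
    if workers <= 1:
        # Fully sequential path: no pool of any kind is created.
        for key in keys:
            yield key, download(key)
        return
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the same order as the input keys.
        for key, content in zip(keys, pool.map(download, keys)):
            yield key, content


if __name__ == "__main__":
    # Stand-in for a bucket so the sketch runs without any network access.
    fake_bucket = {"a.txt": b"alpha", "b.txt": b"be"}
    for key, content in iter_contents(fake_bucket, fake_bucket.get, workers=2):
        print(key, len(content))
```

With `workers=1` the helper degrades to a plain loop, which is roughly the behaviour a `_MULTIPROCESSING = False` switch would be expected to give.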

Issue Analytics

  • State: open
  • Created: 5 years ago
  • Comments: 13 (5 by maintainers)

Top GitHub Comments

1 reaction
yanickc commented, Jan 22, 2020

Might it be related to: https://bugs.python.org/issue33725 ?

I was having messages like this:

objc[70381]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[70381]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.

Setting this environment variable:

export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES

before execution was a workaround for me.
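The same workaround can be applied from a small Python launcher instead of the shell. This is only a sketch: the helper names are hypothetical, and it assumes the variable has to be present in the environment when the process starts, which is why it is set for a child process rather than in the already running interpreter.

```python
import os
import subprocess
import sys

FORK_SAFETY_VAR = "OBJC_DISABLE_INITIALIZE_FORK_SAFETY"


def env_with_fork_safety_disabled(base_env=None):
    """Return a copy of the environment with the workaround variable set."""
    env = dict(os.environ if base_env is None else base_env)
    env[FORK_SAFETY_VAR] = "YES"
    return env


def run_with_workaround(argv):
    """Run a command in a child process whose environment carries the variable."""
    return subprocess.run(argv, env=env_with_fork_safety_disabled(), check=False)


if __name__ == "__main__":
    # Placeholder command: the child only reports what it sees in its environment.
    # In practice argv would be the script that calls s3_iter_bucket.
    child_code = f"import os; print(os.environ.get({FORK_SAFETY_VAR!r}))"
    run_with_workaround([sys.executable, "-c", child_code])
```

Note that this only disables the safety check that triggers the crash; it does not make `fork()` itself any safer on macOS.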

1 reaction
lambdamusic commented, Mar 8, 2019

Hey folks, thanks for the feedback. I also thought it was a memory issue, but I didn’t notice anything out of the ordinary there. I’ll look at it again.

Also, I’ll try the sleep(5) test later today.
