Multiprocessing error with s3_iter_bucket on macOS
OS: 10.14.1 (Mojave)
Python: 3.7.2 (brew)
I’ve been using s3_iter_bucket to traverse an S3 bucket, but no matter how many workers I use (I tried the default 16, then 8, then 1), Python crashes with a multiprocessing error.
I’m not sure whether this is an OS issue or a smart_open issue, but I wonder if anyone else has experienced it.
This is the relevant bit when I’m calling smart_open:
# ...
# iterate only through one dir at a time
for key, content in s3_iter_bucket(bucket, prefix=bucket_prefix, workers=1):
    click.secho(">>>>> File: " + key + str(len(content)), fg="green")
    parse_and_index_data(content, index_name, host_name, key)
# ...
And this is the usual error after a few thousand items have been processed (more precisely, this is what I see after I hit Ctrl-C, because Python crashes with a system dialog and everything hangs):
Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/pool.py", line 110, in worker
    task = get()
  File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/queues.py", line 352, in get
    res = self._reader.recv_bytes()
  File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
KeyboardInterrupt
Any hints? One workaround would be to be able to set _MULTIPROCESSING = False when calling s3_iter_bucket, but that is not possible at the moment…
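For anyone who wants to rule multiprocessing out in the meantime, here is a minimal sketch of a single-process loop that yields the same (key, content) pairs. It is not part of smart_open: the name iter_bucket_serial is made up for illustration, and it assumes that bucket is a boto3 Bucket resource (or something that behaves like one), which may not match what you pass to s3_iter_bucket.

```python
def iter_bucket_serial(bucket, prefix=""):
    """Yield (key, content) pairs one object at a time, in the calling process.

    Hypothetical helper, not a smart_open function.
    Assumption: `bucket` behaves like a boto3 `Bucket` resource, i.e.
    `bucket.objects.filter(Prefix=...)` yields object summaries that have
    a `.key` attribute and a `.get()` method.
    """
    for summary in bucket.objects.filter(Prefix=prefix):
        # get() returns a dict whose "Body" entry is a readable stream.
        body = summary.get()["Body"]
        try:
            content = body.read()
        finally:
            body.close()
        yield summary.key, content


# Possible usage (bucket name is a placeholder, requires boto3 and credentials):
#
#   import boto3
#   bucket = boto3.resource("s3").Bucket("my-bucket")
#   for key, content in iter_bucket_serial(bucket, prefix="some/dir/"):
#       print(key, len(content))
```

Since no worker pool is created, nothing is forked, so if the crash disappears with this loop it points at the multiprocessing side rather than at S3 or the parsing code. It will be slower than the parallel version on large buckets.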
Issue Analytics
- State:
- Created 5 years ago
- Comments:13 (5 by maintainers)
Might it be related to: https://bugs.python.org/issue33725 ?
I was having messages like this:
doing:
before execution was a workaround for me.
Hey folks thanks for the feedback. I also thought it was a memory issue but didn’t notice anything out of the ordinary there. Will look at it again.
Also, I’ll try the sleep(5) test later today.