Fast reader in io.ascii does not work with multiprocessing 'spawn' method

The multiprocessing module in Python supports several start methods, including fork and spawn. However, the fast reader in io.ascii currently does not work properly with the spawn method:

import multiprocessing as mp
from astropy.io.ascii import read

if __name__ == "__main__":

    mp.set_start_method('spawn')
    print(read('a,b\n1,2\n3,4\n5,\n6,7', fast_reader={'parallel': True}))

gives:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
  File "astropy/io/ascii/cparser.pyx", line 836, in astropy.io.ascii.cparser._copy_cparser
  File "astropy/io/ascii/cparser.pyx", line 192, in astropy.io.ascii.cparser.CParser.__cinit__
TypeError: __cinit__() got an unexpected keyword argument 'expchar'
...
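
The traceback shows the error being raised inside pickle.load in the child process. With spawn, the child starts a fresh interpreter and receives its process object by unpickling, and here the reconstruction function (_copy_cparser) passes a keyword that CParser.__cinit__ does not accept. The pure-Python sketch below reproduces that class of failure without multiprocessing. The Parser class, _copy_parser and roundtrip are hypothetical stand-ins for illustration, not astropy's actual code.

```python
import pickle


class Parser:
    """Hypothetical stand-in for a parser object; not astropy's CParser."""

    def __init__(self, source, delimiter=","):
        self.source = source
        self.delimiter = delimiter

    def __reduce__(self):
        # Tell pickle how to rebuild this object in another process.
        # The 'expchar' keyword is deliberately out of sync with __init__,
        # mirroring the mismatch reported in the traceback.
        kwargs = {"delimiter": self.delimiter, "expchar": "E"}
        return (_copy_parser, (self.source, kwargs))


def _copy_parser(source, kwargs):
    # Called during unpickling; under 'spawn' this runs in the child process.
    return Parser(source, **kwargs)


def roundtrip(obj):
    """Pickle and unpickle obj, as 'spawn' does when handing work to a child."""
    return pickle.loads(pickle.dumps(obj))


if __name__ == "__main__":
    try:
        roundtrip(Parser("a,b\n1,2"))
    except TypeError as exc:
        print("unpickling failed:", exc)
```

With fork the child inherits the parent's memory and the process object is not unpickled, which would explain why a mismatch like this only surfaces under spawn.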

This bug will cause issues for Python 3.8, since the default start method on macOS is changing from fork to spawn (see https://github.com/python/cpython/commit/17a5588740b3d126d546ad1a13bdac4e028e6d50). The easy fix is of course to hard-code the method to fork in the fast reader, but this doesn't really fix the underlying issue. It would be good to fix this properly, since there is probably a good reason the Python dev team switched to spawn (their changelog entry says 'On macOS, spawn start method is now the default: fork start method is no longer reliable on macOS, see https://bugs.python.org/issue33725.').
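
For reference, the "easy fix" of pinning the start method can be done without touching the global default by requesting a fork context explicitly. This is only a minimal sketch using the standard library; the helper names _square and run_with_fork are hypothetical, and it does not show how the fast reader itself creates its processes.

```python
import multiprocessing as mp


def _square(x):
    # Trivial worker used only to show that the pool runs.
    return x * x


def run_with_fork(values):
    # get_context returns a context bound to one start method and leaves
    # the global default (set_start_method) untouched.
    # It raises ValueError on platforms where 'fork' is unavailable.
    ctx = mp.get_context("fork")
    with ctx.Pool(processes=2) as pool:
        return pool.map(_square, values)


if __name__ == "__main__":
    print(run_with_fork([1, 2, 3]))
```

This only sidesteps the problem: fork is not available on Windows, and the changelog entry quoted above describes it as no longer reliable on macOS.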

In any case the above issue can be reproduced with existing Python versions, so there’s definitely a bug here to fix.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 10 (10 by maintainers)

Top GitHub Comments

astrofrog commented, Jun 16, 2019 (2 reactions)

saimn commented, Jun 17, 2019 (0 reactions)

Fixed by #8853, thanks @astrofrog 🎉
