ENH: parallel processing for ufuncs
ufuncs support lots of great features, but one that seems to be missing is parallel processing. Below is a quick benchmark which shows a ~3x reduction in time using 4 threads (my machine has 4 physical cores). I don't know what the API should be; perhaps a np.parallel(True)
call would turn it on for subsequent calls, as I would hate to have to litter the code with additional kwargs. There would probably also have to be some logic about the relative CPU intensity vs. memory bandwidth of the ufunc call, since simple ones like addition probably won't go any faster and might even be worse. I have read that Python does not really support multithreading, but with all the processing-intensive NumPy calls releasing the GIL, I think it works just fine here.
import timeit
setup = """
import numpy as np
from concurrent.futures import ThreadPoolExecutor
size = int(1e7)  # np.linspace needs an integer sample count
threads = 4
x = np.linspace(0, 2*np.pi, size)
"""
stmt = """
with ThreadPoolExecutor(max_workers=threads) as executor:
    indices = np.linspace(0, size, threads + 1).astype(int)  # the np.int alias has been removed
    for i in range(threads):
        executor.submit(lambda chunk: np.sin(chunk, out=chunk), x[indices[i]:indices[i+1]])
"""
print(timeit.repeat(stmt, setup, number=10))
print(timeit.repeat("np.sin(x, out=x)", setup, number=10))
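The same chunk-and-submit pattern can be wrapped in a small reusable helper. The sketch below is only illustrative: parallel_apply is a hypothetical name, not an existing NumPy API, and it assumes the ufunc accepts an out= argument so each chunk can be written in place.
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def parallel_apply(ufunc, arr, threads=4):
    # np.array_split on a 1D array returns views, so writing each chunk
    # in place updates the original array. The threads only help because
    # NumPy releases the GIL inside the ufunc inner loop.
    chunks = np.array_split(arr, threads)
    with ThreadPoolExecutor(max_workers=threads) as executor:
        for chunk in chunks:
            executor.submit(ufunc, chunk, out=chunk)
    # leaving the with-block waits for all submitted calls to finish
    return arr

x = np.linspace(0, 2*np.pi, int(1e7))
parallel_apply(np.sin, x)
As noted above, the benefit depends on the ufunc: compute-heavy calls like np.sin can scale with the number of cores, while memory-bandwidth-bound ones like np.add may see little or no speedup.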
Issue Analytics
- State:
- Created 7 years ago
- Reactions: 1
- Comments: 5 (5 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
I respect that it's out of scope. I think that, in the long run, high-performance single-threaded computing is a contradiction in terms. I guess your view is that NumPy is a building block towards that, not the complete solution. Processing ufuncs in parallel is a trivially parallelizable task, but that's another contradiction in terms.
Indeed.