Parallel job takes more time than non-parallel job?
It seems that my batches are executed one by one rather than in parallel. I'm using IPython on a 32-core machine.
import pandas, numpy, hashlib
from joblib import Parallel, delayed
d = pandas.DataFrame({'a': numpy.random.randn(1000 * 1000)})
def hash(x):
    return hashlib.sha256(x.encode('utf-8')).hexdigest()
The output:
In [35]: Parallel(n_jobs=-1, batch_size=100 * 1000, verbose=20)(delayed(hash)(str(row[1:])) for row in d.itertuples())
[Parallel(n_jobs=-1)]: Done 32 tasks | elapsed: 0.0s
[Parallel(n_jobs=-1)]: Done 100032 tasks | elapsed: 6.8s
[Parallel(n_jobs=-1)]: Done 200032 tasks | elapsed: 13.5s
[Parallel(n_jobs=-1)]: Done 300032 tasks | elapsed: 20.1s
[Parallel(n_jobs=-1)]: Done 400032 tasks | elapsed: 26.9s
[Parallel(n_jobs=-1)]: Done 500032 tasks | elapsed: 33.9s
[Parallel(n_jobs=-1)]: Done 600032 tasks | elapsed: 40.8s
[Parallel(n_jobs=-1)]: Done 700032 tasks | elapsed: 47.5s
[Parallel(n_jobs=-1)]: Done 800032 tasks | elapsed: 54.2s
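
One possible reading of the log above, not confirmed anywhere in this thread: each delayed call carries a single short string, so the cost of dispatching a million tiny tasks may outweigh the SHA-256 work itself. A minimal sketch of a common workaround is to slice the input by hand so that each joblib task hashes many rows. The names sha256_hex, make_chunks, hash_chunk and parallel_hash are hypothetical helpers introduced for illustration, and the chunk size is an arbitrary assumption.

```python
import hashlib


def sha256_hex(text):
    """Hash one string as in the question (renamed to avoid shadowing built-in hash)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_chunks(items, chunk_size):
    """Split a list into consecutive slices of at most chunk_size items."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def hash_chunk(chunk):
    """Hash a whole slice inside one task, so one dispatch covers many rows."""
    return [sha256_hex(text) for text in chunk]


def parallel_hash(texts, n_jobs=-1, chunk_size=100_000):
    """Hash texts with one joblib task per chunk (hypothetical helper)."""
    # Imported lazily so the helpers above stay usable without joblib installed.
    from joblib import Parallel, delayed

    chunks = make_chunks(texts, chunk_size)
    results = Parallel(n_jobs=n_jobs)(delayed(hash_chunk)(c) for c in chunks)
    # Flatten the per-chunk lists back into one list, in input order.
    return [digest for part in results for digest in part]
```

With the DataFrame from the question, the input could be built as texts = [str(row[1:]) for row in d.itertuples()] and passed to parallel_hash(texts). Whether this actually beats the serial loop depends on the machine and on how long building the strings takes, so it is worth timing both.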
Issue Analytics
- State:
- Created 8 years ago
- Comments: 12 (6 by maintainers)

It worked. Thanks. The execution time is 2 seconds rather than 12 seconds.
from math import sqrt
from joblib import Parallel, delayed
from numpy import square
import numpy as np
import time

n = 10000000

def sim(x):
    time.sleep(100)

chunk = []
arr = np.arange(n)
for i in range(0, len(arr), 100):
    chunk.append(arr[i:(i+100)])

if __name__ == "__main__":
I did the same, but a single core takes less time than 8 cores.