Proper parallel optimisation
I have a single-threaded function which I want to optimise. I'm trying to write a wrapper that would handle multiple runs at the same time, but I'm noticing considerable degradation in results as I increase the number of parallel evaluations.
Here is the rough logic of what I’m doing:
```python
from __future__ import annotations

from pprint import pprint
from typing import Callable

import numpy as np
from bayes_opt.bayesian_optimization import BayesianOptimization
from bayes_opt.util import UtilityFunction
from tqdm import tqdm, trange


def multivariable_func(r: float, x: float, y: float, diff: float) -> float:
    # Toy objective; returns the negative loss because bayes_opt maximizes.
    r = int(r)
    diff = diff > 0.5
    loss = (r - 5) ** 2
    loss += (x**2 + y**2 - r) ** 2
    loss += abs(x - y) * (-1) ** int(diff)
    loss += 0.5 * x
    loss += -0.25 * int(diff)
    return -loss


def optimize(func: Callable[..., float], num_iter: int, bounds: dict[str, tuple[float, float]], num_workers=0):
    init_samples = int(np.sqrt(num_iter))
    optimizer = BayesianOptimization(f=None, pbounds=bounds, verbose=0)
    init_kappa = 10
    # Decay kappa so that it reaches 0.1 by the final iteration.
    kappa_decay = (0.1 / init_kappa) ** (1 / (num_iter - init_samples))
    utility = UtilityFunction(
        kind="ucb", kappa=init_kappa, xi=0.0, kappa_decay=kappa_decay, kappa_decay_delay=init_samples
    )

    # With no observations registered yet, suggest() returns random points,
    # so this seeds the optimizer with `init_samples` random samples.
    init_queue = [optimizer.suggest(utility) for _ in range(init_samples)]
    result_queue = []
    tbar = tqdm(total=num_iter, leave=False)
    while len(optimizer.res) < num_iter:
        sample = init_queue.pop(0) if init_queue else optimizer.suggest(utility)
        loss = func(**sample)
        # Hold each result back until `num_workers` evaluations are queued,
        # simulating workers whose results arrive with a delay.
        result_queue.append((sample, loss))
        if len(result_queue) >= num_workers:
            try:
                optimizer.register(*result_queue.pop(0))
                utility.update_params()
                tbar.update()
            except KeyError:  # duplicate point
                pass
    return optimizer.max


bounds = {"r": [-10, 10], "x": [-10, 10], "y": [-10, 10], "diff": [0, 1]}

all_results = {}
for num_workers in tqdm([1, 2, 4, 8], desc="Checking num_workers"):
    results = []
    for idx in trange(2, desc=f"Sampling with {num_workers=}"):
        best = optimize(multivariable_func, 400, bounds, num_workers)
        results.append(best["target"])
    all_results[num_workers] = np.mean(results)
    tqdm.write(f"Result for optimizing with {num_workers=}: {all_results[num_workers]}")
print("\n")
pprint(all_results)
```
The `result_queue` variable simulates evaluation across multiple processes: each result is held back until `num_workers` evaluations are in flight, so the optimizer only sees a suggestion's outcome after `num_workers - 1` further suggestions have already been generated.
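For reference, here is how the same suggest/evaluate/register loop could run on real worker processes rather than through the simulation above. This is a sketch of mine, not code from the issue: it assumes the same `bayes_opt` suggest/register API as above, and `func` must be defined at module level so `ProcessPoolExecutor` can pickle it.

```python
from concurrent.futures import FIRST_COMPLETED, ProcessPoolExecutor, wait

def optimize_parallel(func, num_iter, bounds, num_workers=4):
    # Illustrative sketch only: the same loop as above, but with real workers.
    optimizer = BayesianOptimization(f=None, pbounds=bounds, verbose=0)
    utility = UtilityFunction(kind="ucb", kappa=10, xi=0.0)
    pending = {}  # future -> the sample it is evaluating
    with ProcessPoolExecutor(max_workers=num_workers) as pool:
        while len(optimizer.res) < num_iter:
            # Keep every worker busy with a fresh suggestion.
            while len(pending) < num_workers:
                sample = optimizer.suggest(utility)
                pending[pool.submit(func, **sample)] = sample
            # Register results as soon as any worker finishes.
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                sample = pending.pop(future)
                try:
                    optimizer.register(params=sample, target=future.result())
                except KeyError:  # duplicate point
                    pass
    return optimizer.max
```

Note that this version shares the property the simulation is modelling: all `num_workers` suggestions in flight are drawn from the same, not-yet-updated surrogate.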
Here are the results:
```
{1: 4.320798413579277,
 2: 4.320676522735756,
 4: 3.5379530743926133,
 8: 2.175667857740832}
```
As can be seen, the more processes I use, the worse the final result is. I don't understand why that would happen; even if a few of the suggestions are not evaluated properly, the results should not differ so much.
What am I missing?
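For context, one plausible explanation: while `num_workers` results are pending, every call to `suggest()` sees an unchanged surrogate model, so the in-flight suggestions are drawn from the same posterior and tend to cluster around the same acquisition maximum; the larger the batch, the more of the evaluation budget is spent on near-duplicates. A common mitigation in the parallel Bayesian-optimisation literature is the "constant liar" heuristic: before suggesting the next point of a batch, register a provisional, pessimistic target for every point still being evaluated. `bayes_opt` has no way to un-register a point, so the sketch below (my illustration with hypothetical names, not part of the library) rebuilds a throwaway optimizer per suggestion:

```python
def suggest_batch(observations, bounds, utility, batch_size):
    # "Constant liar": pretend pending points already returned a pessimistic
    # target so the acquisition function stops re-proposing them.
    lie = min((target for _, target in observations), default=0.0)
    batch = []
    for _ in range(batch_size):
        helper = BayesianOptimization(f=None, pbounds=bounds, verbose=0)
        for params, target in observations + [(p, lie) for p in batch]:
            try:
                helper.register(params=params, target=target)
            except KeyError:  # duplicate point
                pass
        batch.append(helper.suggest(utility))
    return batch
```

The lies only ever live inside the throwaway helper; the real observation list is updated exclusively with actual evaluation results.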
Top GitHub Comments
OK, thank you. I've done some tests and here are the results for my example:
These are averaged over 10 runs with different seeds to remove the effect of lucky guesses.
In this toy example, I would say anything above 4.3 is satisfactory. 8 workers require 700 iterations instead of 500, which equates to a ~470% speed-up in wall-clock terms (700/8 ≈ 88 parallel batches instead of 500 sequential evaluations, i.e. roughly 5.7× faster). Currently, this is fine for me, so I will stop analysing this for now.
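A multi-seed harness along the lines described above might look like this. It is a sketch, not the commenter's actual test code, and the `random_state` argument is an assumed extension of `optimize()` that would be forwarded to `BayesianOptimization` for reproducibility:

```python
# Hypothetical harness; `random_state` is an assumed parameter of optimize().
seeds = range(10)
for num_workers in [1, 2, 4, 8]:
    scores = [
        optimize(multivariable_func, 500, bounds, num_workers, random_state=seed)["target"]
        for seed in seeds
    ]
    print(f"{num_workers=}: mean target {np.mean(scores):.4f}")
```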
Looks really promising! Sorry, I think it is going to take me a few days to look at this properly though 😦