
Expected `choose_generation_strategy_kwargs` `num_trials` to control `optimization_complete` bool during `get_next_trials`


Not a blocker for me, but I’m leaving this here for provenance. In practice, I’d typically defer to using AxSearch via Ray Tune’s interface to do asynchronous optimization, but I wanted to have an example that does batch optimization, with the trials in each batch running in parallel.
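For reference, that asynchronous route would look roughly like the sketch below. Treat it as a sketch only: Ray Tune’s import paths and reporting API have shifted between releases (ray.tune.suggest vs. ray.tune.search, tune.report vs. session.report), so the exact names may need adjusting for your version.

from ax.service.ax_client import AxClient
from ax.utils.measurement.synthetic_functions import branin
from ray import tune
from ray.tune.search import ConcurrencyLimiter
from ray.tune.search.ax import AxSearch


def objective(config):
    # report the metric under the same name as objective_name below
    tune.report(branin=branin(config["x1"], config["x2"]))


ax_client = AxClient()
ax_client.create_experiment(
    parameters=[
        {"name": "x1", "type": "range", "bounds": [-5.0, 10.0]},
        {"name": "x2", "type": "range", "bounds": [0.0, 15.0]},
    ],
    objective_name="branin",
    minimize=True,
)

# cap concurrency at 2, analogous to the batch size used below
search_alg = ConcurrencyLimiter(AxSearch(ax_client=ax_client), max_concurrent=2)
tune.run(objective, search_alg=search_alg, num_samples=21)

The synchronous batch version I was actually after: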

import ray
from ax.service.ax_client import AxClient
from ax.utils.measurement.synthetic_functions import branin

batch_size = 2

ax_client = AxClient()
ax_client.create_experiment(
    parameters=[
        {"name": "x1", "type": "range", "bounds": [-5.0, 10.0]},
        {"name": "x2", "type": "range", "bounds": [0.0, 15.0]},
    ],
    objective_name="branin",
    minimize=True,
    # Sets max parallelism to batch_size (2) for all steps of the generation
    # strategy; num_trials was expected to cap the total trial count and
    # eventually flip optimization_complete to True in get_next_trials.
    choose_generation_strategy_kwargs={
        "num_trials": 21,
        "max_parallelism_override": batch_size,
    },
)


@ray.remote
def evaluate(parameters):
    return {"branin": branin(parameters["x1"], parameters["x2"])}

optimization_complete = False
while not optimization_complete:
    trial_mapping, optimization_complete = ax_client.get_next_trials(batch_size)
    
    # start running trials in a queue (new trials will start as resources are freed)
    futures = [evaluate.remote(parameters) for parameters in trial_mapping.values()]
    
    # wait for all trials in the batch to complete before continuing (i.e. blocking)
    results = ray.get(futures)
    
    # report the completion of trials to the Ax client
    for trial_index, raw_data in zip(trial_mapping.keys(), results):
        ax_client.complete_trial(trial_index=trial_index, raw_data=raw_data)

I figured one workaround would be:

    ...
    choose_generation_strategy_kwargs={
        "max_parallelism_override": batch_size,
        "enforce_sequential_optimization": False,
    },
    ...
...
num_batches = np.ceil(total_trials / batch_size).astype(int)
for batch_num in range(num_batches):
    current_batch_size = (
        batch_size
        if batch_num < num_batches - 1
        else total_trials - batch_num * batch_size
    )
    trial_mapping, optimization_complete = ax_client.get_next_trials(current_batch_size)
    ...

but in the batch immediately after the transition from Sobol to GPEI, get_next_trials only produces one trial instead of two.

So one real workaround is:

import numpy as np
import ray

from ax.service.ax_client import AxClient
from ax.utils.measurement.synthetic_functions import branin

batch_size = 2
total_trials = 11

ax_client = AxClient()
ax_client.create_experiment(
    parameters=[
        {"name": "x1", "type": "range", "bounds": [-5.0, 10.0]},
        {"name": "x2", "type": "range", "bounds": [0.0, 15.0]},
    ],
    objective_name="branin",
    minimize=True,
    # Note: with enforce_sequential_optimization=False, max parallelism is not
    # enforced and other max parallelism settings are ignored (per the Ax log
    # output further down).
    choose_generation_strategy_kwargs={
        "max_parallelism_override": batch_size,
        "enforce_sequential_optimization": False,
    },
)

@ray.remote
def evaluate(parameters):
    return {"branin": branin(parameters["x1"], parameters["x2"])}

num_batches = np.ceil(total_trials / batch_size).astype(int)
for batch_num in range(num_batches):
    current_batch_size = (
        batch_size
        if batch_num < num_batches - 1
        else total_trials - batch_num * batch_size
    )
    
    # get_next_trial() returns a (parameters, trial_index) tuple
    trials = [ax_client.get_next_trial() for _ in range(current_batch_size)]
    parameters_batch, trial_indices = zip(*trials)

    # start running trials in a queue (new trials will start as resources are freed)
    futures = [evaluate.remote(parameters) for parameters in parameters_batch]

    # wait for all trials in the batch to complete before continuing (i.e. blocking)
    results = ray.get(futures)

    # report the completion of trials to the Ax client
    for trial_index, raw_data in zip(trial_indices, results):
        ax_client.complete_trial(trial_index=trial_index, raw_data=raw_data)

I’m guessing that specifying a generation strategy directly would address this, though I don’t know whether the behavior of skipping a trial at the Sobol-to-GPEI transition would still show up.
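For completeness, manually specifying the generation strategy would look roughly like the sketch below (argument names here are from memory and may differ slightly between Ax versions). As I understand it, giving the last step a finite num_trials, rather than -1, is what would eventually let get_next_trials return optimization_complete=True. I haven’t checked whether this also avoids the skipped trial at the Sobol-to-GPEI transition.

from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.registry import Models
from ax.service.ax_client import AxClient

batch_size = 2
total_trials = 21

generation_strategy = GenerationStrategy(
    steps=[
        # quasi-random initialization
        GenerationStep(
            model=Models.SOBOL,
            num_trials=5,
            max_parallelism=batch_size,
        ),
        # Bayesian optimization for the remaining budget; a finite num_trials
        # (not -1) is what should let the strategy report completion
        GenerationStep(
            model=Models.GPEI,
            num_trials=total_trials - 5,
            max_parallelism=batch_size,
        ),
    ]
)

ax_client = AxClient(generation_strategy=generation_strategy)
ax_client.create_experiment(
    parameters=[
        {"name": "x1", "type": "range", "bounds": [-5.0, 10.0]},
        {"name": "x2", "type": "range", "bounds": [0.0, 15.0]},
    ],
    objective_name="branin",
    minimize=True,
)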

Issue Analytics

  • State: closed
  • Created: 10 months ago
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

1 reaction
lena-kashtelyan commented, Dec 5, 2022

“Under what conditions would optimization_complete evaluate to True? Perhaps a docstring update could help (happy to put in a PR based on the discussion here).”

Great point, we should add that to the docstring! I think the conditions are:

  • the user actually set num_trials on the last step of their generation strategy to something other than -1 (I think this is only possible manually, not through choose_generation_strategy at the moment),
  • the search space was exhausted (applies to search spaces without range parameters), or
  • the optimization has completely converged and the model re-suggested the same point more than some N times.

I need to make some changes to that function soon, so I can augment the docstring as part of that work! Or feel free to do that yourself; a PR would be much appreciated.

1 reaction
sgbaird commented, Dec 5, 2022

OK, I think I see my misunderstanding now. There are 5 Sobol trials, so the behavior described makes sense to me. I had figured the number of initialization trials was always 2 * num_parameters, but it seems to be max(2 * num_parameters, 5).

(sdl-demo) PS C:\Users\sterg\Documents\GitHub\sparks-baird\self-driving-lab-demo> c:; cd 'c:\Users\sterg\Documents\GitHub\sparks-baird\self-driving-lab-demo'; & 'C:\User…
…isable logging, set the `verbose_logging` argument to `False`. Note that float values in the logs are rounded to 6 decimal points.
[INFO 12-05 09:39:07] ax.service.utils.instantiation: Inferred value type of ParameterType.FLOAT for parameter x1. If that is not the expected value type, you can explicity specify 'value_type' ('int', 'float', 'bool' or 'str') in parameter dict.
[INFO 12-05 09:39:07] ax.service.utils.instantiation: Inferred value type of ParameterType.FLOAT for parameter x2. If that is not the expected value type, you can explicity specify 'value_type' ('int', 'float', 'bool' or 'str') in parameter dict.
[INFO 12-05 09:39:07] ax.service.utils.instantiation: Created search space: SearchSpace(parameters=[RangeParameter(name='x1', parameter_type=FLOAT, range=[-5.0, 10.0]), RangeParameter(name='x2', parameter_type=FLOAT, range=[0.0, 15.0])], parameter_constraints=[]).
[INFO 12-05 09:39:07] ax.modelbridge.dispatch_utils: Using Bayesian optimization since there are more ordered parameters than there are categories for the unordered categorical parameters.
[INFO 12-05 09:39:07] ax.modelbridge.dispatch_utils: If `enforce_sequential_optimization` is False, max parallelism is not enforced and other max parallelism settings will be ignored.
[INFO 12-05 09:39:07] ax.modelbridge.dispatch_utils: Using Bayesian Optimization generation strategy: GenerationStrategy(name='Sobol+GPEI', steps=[Sobol for 5 trials, GPEI for subsequent trials]). Iterations after 5 will take longer to generate due to model-fitting.
[INFO 12-05 09:39:07] ax.service.ax_client: Generated new trial 0 with parameters {'x1': -1.523101, 'x2': 5.242057}.
[INFO 12-05 09:39:07] ax.service.ax_client: Generated new trial 1 with parameters {'x1': 1.868107, 'x2': 8.988328}.
2022-12-05 09:39:11,604 INFO worker.py:1518 -- Started a local Ray instance.
[INFO 12-05 09:39:16] ax.service.ax_client: Completed trial 0 with data: {'branin': (22.580185, None)}.
[INFO 12-05 09:39:16] ax.service.ax_client: Completed trial 1 with data: {'branin': (37.554675, None)}.
[INFO 12-05 09:39:16] ax.service.ax_client: Generated new trial 2 with parameters {'x1': -3.860303, 'x2': 4.642361}.
[INFO 12-05 09:39:16] ax.service.ax_client: Generated new trial 3 with parameters {'x1': 9.908869, 'x2': 6.520691}.
[INFO 12-05 09:39:16] ax.service.ax_client: Completed trial 2 with data: {'branin': (91.63373, None)}.
[INFO 12-05 09:39:16] ax.service.ax_client: Completed trial 3 with data: {'branin': (14.512183, None)}.
[INFO 12-05 09:39:16] ax.service.ax_client: Generated new trial 4 with parameters {'x1': 1.527272, 'x2': 8.386486}.
[INFO 12-05 09:39:16] ax.service.ax_client: Completed trial 4 with data: {'branin': (30.811008, None)}.
[INFO 12-05 09:39:17] ax.service.ax_client: Generated new trial 5 with parameters {'x1': 0.915877, 'x2': 2.359361}.
[INFO 12-05 09:39:17] ax.service.ax_client: Generated new trial 6 with parameters {'x1': 10.0, 'x2': 1.255361}.
[INFO 12-05 09:39:17] ax.service.ax_client: Completed trial 5 with data: {'branin': (21.098839, None)}.
[INFO 12-05 09:39:17] ax.service.ax_client: Completed trial 6 with data: {'branin': (4.997233, None)}.
[INFO 12-05 09:39:18] ax.service.ax_client: Generated new trial 7 with parameters {'x1': 8.266337, 'x2': 0.0}.
[INFO 12-05 09:39:18] ax.service.ax_client: Generated new trial 8 with parameters {'x1': 8.085169, 'x2': 15.0}.
[INFO 12-05 09:39:18] ax.service.ax_client: Completed trial 7 with data: {'branin': (8.94467, None)}.
[INFO 12-05 09:39:18] ax.service.ax_client: Completed trial 8 with data: {'branin': (187.981055, None)}.
[INFO 12-05 09:39:19] ax.service.ax_client: Generated new trial 9 with parameters {'x1': 8.986846, 'x2': 2.994546}.
[INFO 12-05 09:39:19] ax.service.ax_client: Completed trial 9 with data: {'branin': (2.050832, None)}.
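As an aside, if you want to pin the number of initialization trials rather than rely on that heuristic, I believe choose_generation_strategy also accepts a num_initialization_trials argument, which can be passed through the same kwargs. A minimal sketch (assuming that argument name; not re-tested here):

from ax.service.ax_client import AxClient

batch_size = 2

ax_client = AxClient()
ax_client.create_experiment(
    parameters=[
        {"name": "x1", "type": "range", "bounds": [-5.0, 10.0]},
        {"name": "x2", "type": "range", "bounds": [0.0, 15.0]},
    ],
    objective_name="branin",
    minimize=True,
    choose_generation_strategy_kwargs={
        # pin the Sobol stage to 2 * num_parameters trials instead of the
        # max(2 * num_parameters, 5) default observed above
        "num_initialization_trials": 4,
        "max_parallelism_override": batch_size,
    },
)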
