
get_pareto_optimal_parameters(use_model_predictions=True) returns empty dictionary


Hello, I am using the Service API for MOO with the default MOO model, and I am unable to extract the Pareto-optimal parameters with get_pareto_optimal_parameters(use_model_predictions=True). What could be the reason for that? The command does return some parameterizations when I set use_model_predictions=False, but they don’t seem to align with the plot I created with compute_posterior_pareto_frontier using that same model. Do you know why that might be?
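
For reference, a frontier plot like the one mentioned above is typically produced with compute_posterior_pareto_frontier roughly as in the sketch below. This is a hedged sketch following the Ax multi-objective tutorial, not the exact call from the issue; it assumes the ax_client and the metric names ("stress_ratio", "stiffness_ratio") set up in the snippets further down.

from ax.plot.pareto_frontier import plot_pareto_frontier
from ax.plot.pareto_utils import compute_posterior_pareto_frontier
from ax.utils.notebook.plotting import render

# Pull the two objective metrics off the experiment's optimization config
objectives = ax_client.experiment.optimization_config.objective.objectives

# Compute the posterior Pareto frontier from the fitted model's predictions
frontier = compute_posterior_pareto_frontier(
    experiment=ax_client.experiment,
    data=ax_client.experiment.fetch_data(),
    primary_objective=objectives[0].metric,
    secondary_objective=objectives[1].metric,
    absolute_metrics=["stress_ratio", "stiffness_ratio"],
    num_points=20,
)

# Render the frontier plot
render(plot_pareto_frontier(frontier, CI_level=0.90))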

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments:8 (6 by maintainers)

Top GitHub Comments

1 reaction
esantorella commented, Sep 26, 2022

The source of the bug is that the Pareto frontier is being computed on Y values in transformed space, but the thresholds remain in the original space (whether they are inferred or provided). Here’s a repro:

# Imports assumed for this repro (standard Ax Service API entry points)
from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.registry import Models
from ax.service.ax_client import AxClient
from ax.service.utils.instantiation import ObjectiveProperties


def f():
    NUM_SOBOL_STEPS = 5
    NUM_OF_ITERS = 10

    gs = GenerationStrategy(
        steps=[
            GenerationStep(model=Models.SOBOL, num_trials=NUM_SOBOL_STEPS),
            GenerationStep(model=Models.MOO, num_trials=-1),
        ]
    )

    ax_client = AxClient(
        generation_strategy=gs, random_seed=12345, verbose_logging=True
    )

    params = [
        {
            "name": name,
            "type": "range",
            "bounds": [0.1, 0.95],
            "value_type": "float",
        }
        for name in ["eta", "xi"]
    ]

    objective_names = ["stress_ratio", "stiffness_ratio"]
    ax_client.create_experiment(
        name="solid_hex_thick_opt",
        parameters=params,
        objectives={
            i: ObjectiveProperties(minimize=False)
            for i in objective_names
        },
        outcome_constraints=[],
    )

    # Y values range from 0 to 9
    for i in range(NUM_OF_ITERS):
        res = float(i)
        results = {"stress_ratio": res, "stiffness_ratio": res}
        _, trial_index = ax_client.get_next_trial()
        ax_client.complete_trial(trial_index, results)

    # Since data has been normalized, these values range from -1.46 to 1.46
    print(ax_client.generation_strategy.model.model.Ys)
    # returns an empty dictionary
    # Info logs tell us that it's inferring thresholds of 7.596 for each metric
    # Via pdb, I see that it's comparing the normalized Ys to the unnormalized threshold of 7.596
    print(ax_client.get_pareto_optimal_parameters(use_model_predictions=True))


if __name__ == "__main__":
    f()

In this example, the original Y values range from 0 to 9 for each metric. A threshold of 7.596 is inferred for each metric. But the Pareto frontier is being computed on normalized values, which range from -1.46 to 1.46. Since 1.46 is less than 7.596, none of the values qualifies for inclusion in the Pareto frontier.
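
To make the scale mismatch concrete, here is a minimal sketch outside of Ax (plain NumPy, not the library's actual transform code), assuming a simple standardization like the one applied to the Ys above:

import numpy as np

ys = np.arange(10, dtype=float)   # original Y values, 0..9, as in the repro
inferred_threshold = 7.596        # threshold inferred in the original (untransformed) space

# Standardize the Ys, roughly what the modeling transforms do before fitting
ys_std = (ys - ys.mean()) / ys.std(ddof=1)
print(ys_std.min(), ys_std.max())             # approximately -1.49 and 1.49

# Comparing standardized Ys against the untransformed threshold rules out every point,
# which is why the Pareto frontier (and hence the returned dictionary) comes back empty
print(int((ys_std >= inferred_threshold).sum()))  # 0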

1 reaction
IgorKuszczak commented, Apr 2, 2022

Hello @lena-kashtelyan, thanks for responding so quickly! My implementation is very similar to the one shown in the Service API tutorials. The main difference is that I evaluate the objective function in external software and bring the results back to Python through a pickled dictionary. Additionally, I adapted the code to support parallel evaluations with sequential batch creation. I previously mentioned in #879 that this exact implementation resulted in poor coverage of the Pareto front. I am interested in finding parameterizations where both objective values are higher than 1, so I set the thresholds to 0.9. That being said, I also tried running the same optimization with the thresholds set to 0 and had the same issues: poor coverage of the Pareto front and an inability to extract the results with get_pareto_optimal_parameters. Maybe the two issues are related. I have tried to include all the relevant information in the code snippet below.

## Bayesian Optimization in Service API
# Imports assumed for this snippet (Ax Service API + stdlib)
import concurrent.futures

from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.registry import Models
from ax.service.ax_client import AxClient
from ax.service.utils.instantiation import ObjectiveProperties

NUM_SOBOL_STEPS = 10
NUM_OF_ITERS = 100
BATCH_SIZE = 1  # running sequentially

# Generation strategy: Sobol for the initial trials, then the default MOO model
gs = GenerationStrategy(
    steps=[
        GenerationStep(model=Models.SOBOL, num_trials=NUM_SOBOL_STEPS),
        GenerationStep(model=Models.MOO, num_trials=-1),
    ]
)

# Initialize the Ax client
ax_client = AxClient(generation_strategy=gs, random_seed=12345, verbose_logging=True)

# Define the parameters
params = [
        {
            "name": "eta",
            "type": "range",
            "bounds": [0.1, 0.95],
            "value_type": "float",
        },
        {
            "name": "xi",
            "type": "range",
            "bounds": [0.1, 0.95],
            "value_type": "float",
        }
    ]

# Create the experiment
ax_client.create_experiment(
    name="solid_hex_thick_opt",
    parameters=params,
    objectives={
        i: ObjectiveProperties(minimize=False, threshold=0.9)
        for i in ["stress_ratio", "stiffness_ratio"]
    },
    outcome_constraints=[],
)

## Instantiate the Simulation object whose get_results method evaluates trials in external software
sim = Simulation(...)

# The get_results method returns a dictionary {metric_name1: value1, metric_name2: value2}

# Initialize the variables used in the iteration loop
abandoned_trials_count = 0
NUM_OF_BATCHES = NUM_OF_ITERS // BATCH_SIZE if NUM_OF_ITERS % BATCH_SIZE == 0 else NUM_OF_ITERS // BATCH_SIZE + 1


for i in range(NUM_OF_BATCHES):
    try:
        results = {}
        trials_to_evaluate = {}

        # Sequentially generate the batch
        for j in range(min(NUM_OF_ITERS - i * BATCH_SIZE, BATCH_SIZE)):
            parameterization, trial_index = ax_client.get_next_trial()
            trials_to_evaluate[trial_index] = parameterization

        # Evaluate the batch in parallel and collect the results per trial
        with concurrent.futures.ProcessPoolExecutor(max_workers=3) as executor:
            futures = {
                trial_index: executor.submit(sim.get_results, parametrization)
                for trial_index, parametrization in trials_to_evaluate.items()
            }
            for trial_index, future in futures.items():
                try:
                    results[trial_index] = future.result()
                except Exception as e:
                    ax_client.abandon_trial(trial_index=trial_index)
                    abandoned_trials_count += 1
                    print(f'[WARNING] Abandoning trial {trial_index} due to processing errors.')
                    print(e)
                    if abandoned_trials_count > 0.1 * NUM_OF_ITERS:
                        print('[WARNING] More than 10% of iterations were abandoned. Consider improving the parametrization.')

        # Report each successfully evaluated trial back to the AxClient exactly once
        for trial_index, result in results.items():
            ax_client.complete_trial(trial_index, result)

    except KeyboardInterrupt:
        print('Program interrupted by user')
        break

print(ax_client.get_pareto_optimal_parameters(use_model_predictions=True))