
Suggestions for implementing a composition-based optimization (i.e. fractional portion of ingredients)


For starters, my experience with Ax amounts to running the Loop tutorial once and reading through some of the documentation, such as the parameter types (i.e. I'm fairly new). I also have some familiarity with Bayesian optimization.

The actual use-case is slightly different and more complicated, but I think the following is a suitable toy example. I go over the problem statement, some setup code, and possible solutions. Would love to hear some feedback.

Problem Statement

Take a composite material with the following class/ingredient combinations:

  • Filler: Colloidal Silica (filler_A)
  • Filler: Milled Glass Fiber (filler_B)
  • Resin: Polyurethane (resin_A)
  • Resin: Silicone (resin_B)
  • Resin: Epoxy (resin_C)

Take some toy data of components and their fractional prevalences (various combinations of fillers and resins, with varying numbers of components) along with their objective values (training data), and some model that takes arbitrary input parameters and predicts the objective (strength), which we wish to maximize.

For constraints, I’m thinking:

  • limit the total number of components in any given “formula” (e.g. max of 3 components)
  • naturally, that the compositions sum to 1 (or that abs(1-sum(composition)) <= tol)
  • there has to be at least one filler and at least one resin (if feasible)
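As far as I can tell, Ax's `parameter_constraints` only accept linear inequalities, so the `abs(...) <= tol` condition would need to be split into two one-sided constraints. A minimal sketch of the intended feasibility logic in plain Python (the helper name, `tol`, and `max_components` values are illustrative assumptions, not Ax API):

```python
def satisfies_constraints(composition_by_choice, tol=1e-6, max_components=3):
    """Check the three proposed constraints for a candidate formula.

    `composition_by_choice` maps each ingredient name to its fraction.
    """
    fillers = {"filler_A", "filler_B"}
    resins = {"resin_A", "resin_B", "resin_C"}

    total = sum(composition_by_choice.values())
    n_components = sum(1 for v in composition_by_choice.values() if v > 0)

    # abs(1 - total) <= tol, rewritten as two one-sided (linear) inequalities
    sums_to_one = (total <= 1 + tol) and (total >= 1 - tol)
    has_filler = any(composition_by_choice.get(f, 0) > 0 for f in fillers)
    has_resin = any(composition_by_choice.get(r, 0) > 0 for r in resins)

    return sums_to_one and n_components <= max_components and has_filler and has_resin
```

Splitting the absolute value into `total <= 1 + tol` and `total >= 1 - tol` keeps each condition linear in the parameters, which is the form a linear-constraint interface can accept.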

Setup Code

To make it more concrete, it might look like the following:

choices = ["filler_A", "filler_B", "resin_A", "resin_B", "resin_C", "dummy"]

data = [
        [["filler_A", "filler_B", "resin_C"], [0.4, 0.4, 0.2]],
        [["filler_A", "resin_A", "resin_B"], [0.6, 0.2, 0.2]],
        [["filler_A", "filler_B", "resin_B"], [0.5, 0.3, 0.2]],
        [["filler_A", "resin_B", "dummy"], [0.5, 0.5, 0.0]],
        [["filler_B", "resin_C", "dummy"], [0.6, 0.4, 0.0]],
        [["filler_A", "filler_B", "resin_A"], [0.2, 0.2, 0.6]],
        [["filler_B", "resin_A", "resin_B"], [0.6, 0.2, 0.2]],
        ] # made-up data

def predict(objects, composition):
    # placeholder: some model mapping (components, fractions) -> predicted strength
    ...
    return obj

Possible Solutions

One-hot-like prevalence encoding and components/composition

One-hot-like prevalence encoding

I’ve thought about trying to do a sort of “one-hot encoding” (assuming I’m using this term correctly), such that each component gets its own composition as a variable:

filler_A  filler_B  resin_A  resin_B  resin_C
0.4       0.4       0.0      0.0      0.2
0.6       0.0       0.2      0.2      0.0
0.5       0.3       0.0      0.2      0.0
0.5       0.0       0.0      0.5      0.0
0.0       0.6       0.0      0.0      0.4
0.2       0.2       0.6      0.0      0.0
0.0       0.6       0.2      0.2      0.0

which I think would look like the following:

best_parameters, values, experiment, model = optimize(
    parameters=[
        {
            "name": "filler_A",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "filler_B",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "resin_A",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "resin_B",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "resin_C",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
    ],
    experiment_name="composition_test",
    objective_name="strength",
    evaluation_function=predict,
    parameter_constraints=[
        # not sure if I can use `abs` here
        "abs(1 - (filler_A + filler_B + resin_A + resin_B + resin_C)) <= 1e-6",
        "filler_A + filler_B > 0",
        "resin_A + resin_B + resin_C > 0",
    ],
    total_trials=30,
)

However, this could easily lead to compositions where every component has a nonzero prevalence, which can be problematic from an experimental perspective.
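For reference, the one-hot-style table above can be generated from the toy data programmatically. A minimal sketch (the `choices` and first two rows of `data` are redefined here so the snippet is self-contained; `to_one_hot` is a hypothetical helper, not part of Ax):

```python
choices = ["filler_A", "filler_B", "resin_A", "resin_B", "resin_C", "dummy"]

data = [
    [["filler_A", "filler_B", "resin_C"], [0.4, 0.4, 0.2]],
    [["filler_A", "resin_A", "resin_B"], [0.6, 0.2, 0.2]],
]  # first two rows of the toy data, for brevity

def to_one_hot(row, columns=tuple(c for c in choices if c != "dummy")):
    """Expand a (components, fractions) pair into a dense row over all columns."""
    components, fractions = row
    prevalence = dict(zip(components, fractions))
    return [prevalence.get(col, 0.0) for col in columns]

one_hot_rows = [to_one_hot(row) for row in data]
```

Absent components simply get a prevalence of 0.0, which is what makes the encoding dense (and, as noted above, is also what lets the optimizer assign every component a nonzero fraction).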

components/composition

As I mentioned in the constraints, I’ve also thought about setting an upper limit to the number of components in a formula, which I think might look something like the following:

best_parameters, values, experiment, model = optimize(
    parameters=[
        {
            "name": "object1",
            "type": "choice",
            "values": choices,  # Ax choice parameters take `values` rather than `bounds`
        },
        {
            "name": "object2",
            "type": "choice",
            "values": choices,
        },
        {
            "name": "object3",
            "type": "choice",
            "values": choices,
        },
        {
            "name": "composition1",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "composition2",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "composition3",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
    ],
    experiment_name="composition_test",
    objective_name="strength",
    evaluation_function=predict,
    parameter_constraints=["abs(1 - (composition1 + composition2 + composition3)) <= 1e-6"],
    total_trials=30,
)
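With this parameterization, the `best_parameters` dict returned by `optimize` would still need to be decoded back into a formula. A hedged sketch of that post-processing (the handling of `dummy`, duplicates, and renormalization is my assumption about what would be sensible, not Ax behavior):

```python
def decode_parameters(parameters, tol=1e-6):
    """Collapse object{i}/composition{i} pairs into a formula dict.

    Duplicate choices have their fractions summed; `dummy` and
    near-zero entries are dropped; the result is renormalized to sum to 1.
    """
    formula = {}
    for i in (1, 2, 3):
        name = parameters[f"object{i}"]
        frac = parameters[f"composition{i}"]
        if name == "dummy" or frac <= tol:
            continue
        formula[name] = formula.get(name, 0.0) + frac
    total = sum(formula.values())
    return {name: frac / total for name, frac in formula.items()}

params = {
    "object1": "filler_A", "composition1": 0.5,
    "object2": "resin_B", "composition2": 0.5,
    "object3": "dummy", "composition3": 0.0,
}
formula = decode_parameters(params)
```

One wrinkle this exposes: nothing in the parameterization itself prevents `object1` and `object2` from being the same choice, so duplicates have to be merged (or penalized) somewhere downstream.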

How would you suggest implementing this use-case in Ax? If it would help, I’d be happy to flesh this out into a full MWE or try out any suggestions. The real use-case involves ~100 different components across 4 different classes, and the idea is to (eventually) use this in an experimental adaptive design scheme.

(tag @ramz-i who is the individual in charge of this project in our research group, post here if you have anything to add)


Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 1
  • Comments: 24 (23 by maintainers)

Top GitHub Comments

sgbaird commented, Dec 15, 2021 (2 reactions)

Here is a roadmap of some outstanding items as well as other features/topics that came up along the way. I’ll plan on updating these later if the status changes (along with an “EDIT” keyword). If I may, I’m also cc’ing the people who have been participating or tagged in these discussions to bring attention to the “birds-eye” issue. Thank you to everyone who has helped me out so far. Your responses have been incredibly useful and informative. Personally, it’s been very rewarding for me to get a better understanding of BO through the lens of a practical use case while using what I consider to be an excellent platform for it.

@lena-kashtelyan @Balandat @bernardbeckerman @eytan @saitcakmak @Ryan-Rhys @qingfeng10 @dme65 (@ramseyissa who is lead on the project)

n_components < max_components constraint

“use my own surrogate model”

Adding multiple outcome measurements for fixed parameters as separate trials

  • ✔️ simply add them as separate trials with no SEM specified assuming a low number of observations, otherwise, convert many observations to mean and SEM #752

Incorporate input/parameter uncertainty

For example, when you mix a bunch of components together, but there is some uncertainty in the final composition of each component (e.g. instrument resolution, losses during synthesis)
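One way to get a feel for this, independent of Ax, is a generic Monte-Carlo sketch: perturb each fraction, renormalize, and look at the spread of the objective. The `predict` model below is entirely made up for illustration, and the noise model (independent Gaussian per fraction) is an assumption:

```python
import random

def predict(composition):
    """Stand-in model: a made-up linear 'strength', for illustration only."""
    weights = {"filler_A": 1.0, "filler_B": 0.8, "resin_A": 0.5}
    return sum(weights.get(k, 0.0) * v for k, v in composition.items())

def noisy_objective_samples(composition, sigma=0.01, n_samples=1000, seed=0):
    """Sample the objective under Gaussian noise on each fraction."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        noisy = {k: max(0.0, v + rng.gauss(0.0, sigma))
                 for k, v in composition.items()}
        total = sum(noisy.values())
        noisy = {k: v / total for k, v in noisy.items()}  # re-normalize to sum to 1
        samples.append(predict(noisy))
    return samples

samples = noisy_objective_samples({"filler_A": 0.4, "filler_B": 0.4, "resin_A": 0.2})
mean = sum(samples) / len(samples)
```

The spread of `samples` gives a rough sense of how much input uncertainty matters relative to the model's own predictive uncertainty, which seems like the first question to answer before building it into the optimization loop.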

Multi-objective optimization (e.g. strength, hardness)

While I haven’t mentioned this one yet, we are interested in converting this to a MOO scheme. Note: this is different from what I mentioned above, where I suggested using MOO to implement the n_components < max_components constraint. In this case, the MOO is for “real” outcomes (e.g. strength, hardness).

  • ✔️/❔ following the MOO tutorial should be fairly straightforward, but I’m not sure if MOO will be incompatible with any of the above-mentioned features
sgbaird commented, Dec 7, 2021 (2 reactions)

I adapted what I had from a Loop into a Service API and fixed some theory/understanding issues on my part in the linked example (https://github.com/facebook/Ax/issues/743#issuecomment-987778240), such that I’m now generating a real suggested next_experiment. The main adjustment to my understanding was that a single evaluation of hartmann6 in the examples corresponds to a wet-lab synthesis/characterization iteration for us.

I’m still struggling with the n_components < max_components constraint https://github.com/facebook/Ax/issues/745.

I’m also still confused about how I would replace the GPR surrogate model with my own (which has a built-in uncertainty output).
