Suggestions for implementing a composition-based optimization (i.e. fractional portion of ingredients)
For starters, my experience with Ax is running the Loop tutorial once and reading through some of the documentation, such as the parameter types (i.e. I'm fairly new). I also have some familiarity with Bayesian optimization.
The actual use-case is slightly different and more complicated, but I think the following is a suitable toy example. I go over the problem statement, some setup code, and possible solutions. Would love to hear some feedback.
Problem Statement
Take a composite material with the following class: ingredient combinations:

- Filler: Colloidal Silica (`filler_A`)
- Filler: Milled Glass Fiber (`filler_B`)
- Resin: Polyurethane (`resin_A`)
- Resin: Silicone (`resin_B`)
- Resin: Epoxy (`resin_C`)
Take some toy data of components and their fractional prevalences (various combinations of fillers and resins, and various numbers of components) along with their objective (training data), and some model which takes arbitrary input parameters and predicts the objective (strength) which we wish to maximize.
For constraints, I'm thinking:

- limit the total number of components in any given "formula" (e.g. max of 3 components)
- naturally, that the compositions sum to 1 (or that `abs(1 - sum(composition)) <= tol`)
- there has to be at least one filler and at least one resin (if feasible)
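Concretely, these three constraints could be checked with a small helper like the following (a sketch; `is_valid_formula`, `max_components`, and `tol` are names I'm making up here, not Ax API):

```python
def is_valid_formula(components, fractions, max_components=3, tol=1e-6):
    """Check the three proposed constraints for a candidate formula."""
    # 1) limit the number of components with nonzero prevalence
    n_components = sum(1 for x in fractions if x > 0)
    if n_components > max_components:
        return False
    # 2) the composition must sum to 1 (within tolerance)
    if abs(1 - sum(fractions)) > tol:
        return False
    # 3) at least one filler and at least one resin
    has_filler = any(c.startswith("filler") and x > 0
                     for c, x in zip(components, fractions))
    has_resin = any(c.startswith("resin") and x > 0
                    for c, x in zip(components, fractions))
    return has_filler and has_resin
```

For example, `is_valid_formula(["filler_A", "filler_B", "resin_C"], [0.4, 0.4, 0.2])` passes all three checks, while a filler-only formula fails the third.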
Setup Code
To make it more concrete, it might look like the following:
```python
choices = ["filler_A", "filler_B", "resin_A", "resin_B", "resin_C", "dummy"]

data = [
    [["filler_A", "filler_B", "resin_C"], [0.4, 0.4, 0.2]],
    [["filler_A", "resin_A", "resin_B"], [0.6, 0.2, 0.2]],
    [["filler_A", "filler_B", "resin_B"], [0.5, 0.3, 0.2]],
    [["filler_A", "resin_B", "dummy"], [0.5, 0.5, 0.0]],
    [["filler_B", "resin_C", "dummy"], [0.6, 0.4, 0.0]],
    [["filler_A", "filler_B", "resin_A"], [0.2, 0.2, 0.6]],
    [["filler_B", "resin_A", "resin_B"], [0.6, 0.2, 0.2]],
]  # made-up data

def predict(objects, composition):
    ...
    return obj
```
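For a self-contained MWE, `predict` could be any deterministic stand-in; e.g. a made-up weighted sum of per-ingredient "strength" coefficients (the coefficients are purely illustrative, not a real material model):

```python
def predict(objects, composition):
    """Toy objective: made-up per-ingredient 'strength' contributions."""
    strength_coeff = {
        "filler_A": 3.0, "filler_B": 2.0,
        "resin_A": 1.0, "resin_B": 1.5, "resin_C": 2.5,
        "dummy": 0.0,
    }
    # weighted sum of coefficients by fractional prevalence
    obj = sum(strength_coeff[o] * x for o, x in zip(objects, composition))
    return obj

# e.g. predict(["filler_A", "filler_B", "resin_C"], [0.4, 0.4, 0.2])
#      = 3.0*0.4 + 2.0*0.4 + 2.5*0.2
```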
Possible Solutions
One-hot-like prevalence encoding and components/composition
One-hot-like prevalence encoding
I’ve thought about trying to do a sort of “one-hot encoding” (assuming I’m using this term correctly), such that each component gets its own composition as a variable:
| filler_A | filler_B | resin_A | resin_B | resin_C |
|---|---|---|---|---|
| 0.4 | 0.4 | – | – | 0.2 |
| 0.6 | 0.0 | 0.2 | 0.2 | – |
| 0.5 | 0.3 | – | 0.2 | – |
| 0.5 | – | – | 0.5 | – |
| – | 0.6 | – | – | 0.4 |
| 0.2 | 0.2 | 0.6 | – | – |
| – | 0.6 | 0.2 | 0.2 | – |
which I think would look like the following:
```python
best_parameters, values, experiment, model = optimize(
    parameters=[
        {"name": "filler_A", "type": "range", "bounds": [0.0, 1.0]},
        {"name": "filler_B", "type": "range", "bounds": [0.0, 1.0]},
        {"name": "resin_A", "type": "range", "bounds": [0.0, 1.0]},
        {"name": "resin_B", "type": "range", "bounds": [0.0, 1.0]},
        {"name": "resin_C", "type": "range", "bounds": [0.0, 1.0]},
    ],
    experiment_name="composition_test",
    objective_name="strength",
    evaluation_function=predict,
    parameter_constraints=[
        "abs(1 - (filler_A + filler_B + resin_A + resin_B + resin_C)) <= 1e-6",  # not sure if I can use `abs` here
        "filler_A + filler_B > 0",
        "resin_A + resin_B + resin_C > 0",
    ],
    total_trials=30,
)
```
However, this could easily lead to compositions where all of the components have a nonzero prevalence, which can be problematic from an experimental perspective.
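The one-hot table above can be produced mechanically from `data`; a minimal sketch with plain dicts (`to_one_hot` is a name I'm inventing here, and absent components could equally be encoded as missing rather than 0.0):

```python
choices = ["filler_A", "filler_B", "resin_A", "resin_B", "resin_C", "dummy"]

def to_one_hot(components, fractions, names=None):
    """Map a (components, fractions) pair onto one column per ingredient."""
    names = names if names is not None else choices
    row = {name: 0.0 for name in names if name != "dummy"}
    for c, x in zip(components, fractions):
        if c != "dummy":  # the dummy placeholder carries no prevalence
            row[c] = x
    return row

# e.g. to_one_hot(["filler_A", "filler_B", "resin_C"], [0.4, 0.4, 0.2])
# gives the first row of the table above
```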
components/composition
As I mentioned in the constraints, I’ve also thought about setting an upper limit to the number of components in a formula, which I think might look something like the following:
```python
best_parameters, values, experiment, model = optimize(
    parameters=[
        # choice parameters take `values`, not `bounds`
        {"name": "object1", "type": "choice", "values": choices},
        {"name": "object2", "type": "choice", "values": choices},
        {"name": "object3", "type": "choice", "values": choices},
        {"name": "composition1", "type": "range", "bounds": [0.0, 1.0]},
        {"name": "composition2", "type": "range", "bounds": [0.0, 1.0]},
        {"name": "composition3", "type": "range", "bounds": [0.0, 1.0]},
    ],
    experiment_name="composition_test",
    objective_name="strength",
    evaluation_function=predict,
    parameter_constraints=["abs(1 - (composition1 + composition2 + composition3)) <= 1e-6"],
    total_trials=30,
)
```
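With this parameterization, a suggested arm would also need decoding back into a formula: dropping `dummy`, merging duplicate choices (e.g. `object1 == object3`), and renormalizing so the fractions sum to 1. A sketch, assuming the `objectN`/`compositionN` names above and at least one nonzero non-dummy component:

```python
def decode_arm(params):
    """Turn a parameter dict into a (components, fractions) formula."""
    totals = {}
    for i in (1, 2, 3):
        obj, x = params[f"object{i}"], params[f"composition{i}"]
        if obj != "dummy" and x > 0:
            totals[obj] = totals.get(obj, 0.0) + x  # merge duplicate choices
    norm = sum(totals.values())  # renormalize so fractions sum to 1
    components = sorted(totals)
    fractions = [totals[c] / norm for c in components]
    return components, fractions

# e.g. decode_arm({"object1": "filler_A", "object2": "resin_B", "object3": "dummy",
#                  "composition1": 0.5, "composition2": 0.5, "composition3": 0.0})
```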
How would you suggest implementing this use-case in Ax? If it would help, I’d be happy to flesh this out into a full MWE or try out any suggestions. The real use-case involves ~100 different components across 4 different classes, and the idea is to (eventually) use this in an experimental adaptive design scheme.
(tagging @ramz-i, who is in charge of this project in our research group; please post here if you have anything to add)
Issue Analytics

- Created: 2 years ago
- Reactions: 1
- Comments: 24 (23 by maintainers)
Top GitHub Comments
Here is a roadmap of some outstanding items as well as other features/topics that came up along the way. I’ll plan on updating these later if the status changes (along with an “EDIT” keyword). If I may, I’m also cc’ing the people who have been participating or tagged in these discussions to bring attention to the “birds-eye” issue. Thank you to everyone who has helped me out so far. Your responses have been incredibly useful and informative. Personally, it’s been very rewarding for me to get a better understanding of BO through the lens of a practical use case while using what I consider to be an excellent platform for it.
@lena-kashtelyan @Balandat @bernardbeckerman @eytan @saitcakmak @Ryan-Rhys @qingfeng10 @dme65 (@ramseyissa who is lead on the project)
- `n_components < max_components` constraint
  - `outcome_constraint` as a workaround for constraining `n_components` (#745)
  - rejecting candidates based on `n_components`: https://github.com/facebook/Ax/issues/727#issuecomment-974513487 (rejecting based on the same notion as above)
- `ChoiceParameter`: https://github.com/facebook/Ax/issues/750#issuecomment-990723635 and https://github.com/facebook/Ax/issues/710#issuecomment-990681418
- `int` `RangeParameter` with a Hamming-like kernel, but only for the `component` parameters (i.e. keep a Matern-like kernel for `composition` parameters): https://github.com/facebook/Ax/issues/750#issuecomment-990723635
- “use my own surrogate model”
- Adding multiple outcome measurements for fixed parameters as separate trials
- Incorporate input/parameter uncertainty
  - For example, when you mix a bunch of `components` together, but there is some uncertainty in the final `composition` of each `component` (e.g. instrument resolution, losses during synthesis)
  - `ax_client.attach_trial`: https://github.com/facebook/Ax/issues/751#issuecomment-990382162
  - a distribution (`mean` and `SEM`) that describes the exact (unknown) `composition`: https://github.com/facebook/Ax/issues/751#issuecomment-990398007
  - combining `outcome_sigma_propagated` and `outcome_sigma` via e.g. simple addition or law of total variance
- Multi-objective optimization (e.g. strength, hardness)
  - While I haven’t mentioned this one yet, we are interested in converting this to a MOO scheme. Note: this is different than what I mentioned above, where I suggested using MOO to implement the `n_components < max_components` constraint. In this case, the MOO is for “real” outcomes (e.g. strength, hardness).

I adapted what I had from a Loop into a Service API and fixed some of the theory/understanding issues on my part in the linked example (https://github.com/facebook/Ax/issues/743#issuecomment-987778240), such that I’m generating a real suggested `next_experiment`. The main gap in my understanding is that a single evaluation of `hartmann6` in the examples is like a wet-lab synthesis/characterization iteration for us. I’m still struggling with the `n_components < max_components` constraint (https://github.com/facebook/Ax/issues/745). I’m also still confused about how I would replace the GPR surrogate model with my own (which has a built-in uncertainty output).
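For the input-uncertainty item above, the "simple addition or law of total variance" combination of the two noise sources can be sketched in plain Python (the names `outcome_sigma` and `outcome_sigma_propagated` follow the roadmap wording; treating the two sources as independent is my assumption):

```python
import math

def combine_sigmas(outcome_sigma, outcome_sigma_propagated):
    """Combine measurement noise with noise propagated from composition uncertainty.

    Assumes the two sources are independent, so their variances add
    (simple addition of variances, consistent with the law of total
    variance for independent contributions).
    """
    return math.sqrt(outcome_sigma**2 + outcome_sigma_propagated**2)

# combine_sigmas(3.0, 4.0) -> 5.0
```

The combined value could then be supplied as the SEM when completing a trial.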