Reparameterizing compositional linear equality constraints as linear inequality constraints introduces Sobol sampling bias
I've been struggling with this concept for a week or two and decided to surface it to the Ax devs in a fresh issue to get suggestions and a sanity check.
@bernardbeckerman offered a very useful comment about removing a degenerate search-space dimension, which I've been implementing:
> Brief drive-by comment regarding the constraints in this problem: @sgbaird correct me if I'm wrong, but you really only have 4 free parameters in this case, since the requirement that they sum to 1 removes a degree of freedom. For this reason, it might be useful for you to set a constraint `"filler_A + filler_B + resin_A + resin_B <= 1"` and then, for each parameterization, compute `resin_C = 1 - (filler_A + filler_B + resin_A + resin_B)`.
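A minimal sketch of this suggestion in the Ax Service API (the experiment name, objective name, and bounds here are placeholders, and the evaluation loop is omitted):

```python
from ax.service.ax_client import AxClient

free_params = ["filler_A", "filler_B", "resin_A", "resin_B"]

ax_client = AxClient()
ax_client.create_experiment(
    name="composition",  # placeholder name
    parameters=[
        {"name": p, "type": "range", "bounds": [0.0, 1.0]} for p in free_params
    ],
    objective_name="strength",  # placeholder objective
    parameter_constraints=["filler_A + filler_B + resin_A + resin_B <= 1"],
)

# Recover the dependent component for each suggested parameterization.
parameters, trial_index = ax_client.get_next_trial()
parameters["resin_C"] = 1.0 - sum(parameters[p] for p in free_params)
```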
I described the issue related to a bias in Sobol sampling in https://github.com/facebook/Ax/issues/727#issuecomment-1081315512:
I realized that there is an issue with using Sobol sampling and the one-fewer-degrees-of-freedom style compositional constraint, and wanted to make a record of it in this issue.
Take the case of `x1 + x2 + x3 == 1` reparameterized to `x1 + x2 <= 1`, where `x3 == 1 - x1 - x2`. When the Sobol candidates are generated, the sampler is completely unaware of the original representation involving `x3`. If the bounds on `x1` and `x2` are `[0, 1]`, then it will preferentially favor the exploration of higher values of `x1` and `x2` over the unseen `x3`, if I'm not mistaken.
To illustrate, the bias in Sobol sampling toward the first two parameters might look something like the following made-up data:

```
0.60, 0.30, 0.10
0.30, 0.60, 0.10
0.40, 0.40, 0.20
...
```
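For concreteness, here is one way such candidates could arise under the reparameterization, assuming the inequality constraint is enforced by rejection (`SobolEngine` stands in here for Ax's Sobol generation step; this is a sketch, not Ax's actual implementation):

```python
import torch
from torch.quasirandom import SobolEngine

sobol = SobolEngine(dimension=2, scramble=True, seed=0)
box_points = sobol.draw(1024)  # columns: x1, x2, each in [0, 1]
feasible = box_points[box_points.sum(dim=-1) <= 1.0]  # enforce x1 + x2 <= 1
x3 = 1.0 - feasible.sum(dim=-1, keepdim=True)  # recover the hidden component
samples = torch.cat([feasible, x3], dim=-1)
print(samples.mean(dim=0))  # compare the per-component means
```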
In https://github.com/facebook/Ax/issues/727#issuecomment-1086561847, I mentioned that retaining the original linear equality parameterization is probably the only way to prevent a bias in the Sobol sampling (and, to some extent, possibly in the Bayesian iterations); however, this requires passing directly to BoTorch and manually implementing the underlying transforms (https://github.com/facebook/Ax/issues/769#issuecomment-1024826291, @dme65).
For simple use cases where everything is allowed to range over `[0, 1]`, it might be OK to ignore the transforms. For the more complicated cases I'm dealing with, where parameters have different bounds (e.g. `[0.1, 0.25]`) or the constraint isn't relative to 1.0 (e.g. `x_1 + x_2 <= 0.25 && x_1 + x_2 >= 0.15`), I wonder if this will start causing issues.
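For reference, one hypothetical Service API spec for that more complicated case (all names, bounds, and constraint values below are illustrative, not from a working setup):

```python
# Illustrative only: different per-parameter bounds, and both ends of the
# compositional window expressed as separate linear constraints.
parameters = [
    {"name": "x_1", "type": "range", "bounds": [0.1, 0.25]},
    {"name": "x_2", "type": "range", "bounds": [0.0, 0.25]},
]
parameter_constraints = [
    "x_1 + x_2 <= 0.25",
    "x_1 + x_2 >= 0.15",
]
```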
Top GitHub Comments
I went back to the issue and it seems what you're really trying to do is sample from the unit simplex here? The bias in the sampling is a known issue with the kind of approach you tried out. In BoTorch we actually implement proper sampling from the d-simplex: https://github.com/pytorch/botorch/blob/main/botorch/utils/sampling.py#L270. If the parameters don't appear in other constraints, then you can just sample the other dimensions independently and combine the samples (note that this will destroy the quasi-random low-discrepancy structure, but that's probably fine for initialization; doing this kind of QMC sampling properly for non-box domains turns out to be hard / unsolved). If they do appear in other constraints, then you can build a box with samples of the other parameters by adding dims to the components and then do rejection sampling based on the constraints.
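A minimal sketch of that suggestion, assuming the simplex covers three parameters and two unconstrained dimensions are sampled independently (the sizes here are illustrative):

```python
import torch
from torch.quasirandom import SobolEngine
from botorch.utils.sampling import sample_simplex

n = 16  # number of initial points (illustrative)
simplex_part = sample_simplex(d=3, n=n, qmc=True, seed=0)  # rows sum to 1
other_part = SobolEngine(dimension=2, scramble=True, seed=0).draw(n)
candidates = torch.cat([simplex_part, other_part], dim=-1)
```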
Hooking this up into Ax would be somewhat challenging / require a good amount of effort, since essentially we would need to either automatically parse the constraints to infer that this is what the random initial step should be doing, or introduce some new high-level abstractions for specifying these constraints (in terms of the `ParameterConstraint` object and the language used to define constraints in the Service API).

I think the easiest short-term fix would be to just manually generate the initial random points using this `sample_simplex` utility and then add them as custom arms to the experiment. Then you can use a standard generation strategy after that. Does this make sense?
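A sketch of that short-term fix, assuming an `AxClient` named `ax_client` whose parameters `x1`, `x2`, `x3` form the simplex (evaluation and reporting are left as placeholders):

```python
from botorch.utils.sampling import sample_simplex

for row in sample_simplex(d=3, n=8, qmc=True, seed=0):
    params = {f"x{i + 1}": row[i].item() for i in range(3)}
    _, trial_index = ax_client.attach_trial(parameters=params)
    # Evaluate params externally, then report the result, e.g.:
    # ax_client.complete_trial(trial_index=trial_index, raw_data=result)
```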
You can also give [DelaunayPolytopeSampler](https://github.com/pytorch/botorch/blob/21ce6c7fa9fa907674c37b849e2e5dc683ca2682/botorch/utils/sampling.py#L697-L715) a try. This uses a pretty cool algorithm to uniformly sample from a general polytope (it supports equality constraints as well) by subdividing the whole polytope into primitive shapes and then using their volumes to build a two-stage sampling process. There is some expensive upfront computation (computing the convex hull), but if you need lots of samples this can be worth it. Note, though, that the complexity grows quickly here, so if you are in higher dimensions or have lots of complex constraints this can quickly become intractable.
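A sketch of sampling the unit 3-simplex this way, with the bounds written out as explicit constraint matrices (`A @ x <= b` for inequalities, `C @ x = d` for the equality; shapes follow the sampler's constructor conventions as I understand them):

```python
import torch
from botorch.utils.sampling import DelaunayPolytopeSampler

dim = 3
sampler = DelaunayPolytopeSampler(
    # -I @ x <= 0 encodes x_i >= 0
    inequality_constraints=(
        -torch.eye(dim, dtype=torch.double),
        torch.zeros(dim, 1, dtype=torch.double),
    ),
    # 1^T @ x = 1 encodes the compositional equality constraint
    equality_constraints=(
        torch.ones(1, dim, dtype=torch.double),
        torch.ones(1, 1, dtype=torch.double),
    ),
)
samples = sampler.draw(n=16)
```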