Do the numeric values of ordinal ChoiceParameter choices affect the model?
See original GitHub issue

TL;DR
The docs seem to suggest that ordinal parameters use a Matérn-5/2 kernel, so I assume the answer is "yes," it does affect the model. Is there a way to change this to a Hamming distance so that order constraints can be used with categorical variables? See the toy problem below.
Toy Problem
Take some data based on choices, which are used to construct choice parameters (slots) and constraints. The choices can go into any of the slots, and the choices are constrained to populate the slots in a particular order (e.g. BAC is not allowed, only ABC). The script version is given in ordinal_example.py.
Imports
import numpy as np
import pandas as pd
Choices and Data
data = [["A", "B", "C"], ["D", "C", "A"], ["C", "A", "B"], ["C", "B", "A"]]
choices = list(np.unique(np.array(data).flatten()))
n_choices = len(choices)
Ordinal Encoding
df = pd.DataFrame(data)
choice_lookup = {
choice: choice_num for (choice, choice_num) in zip(choices, range(n_choices))
}
encoded_df = df.replace(choice_lookup)
encoded_choices = pd.DataFrame(choices)[0].map(choice_lookup).values
encoded_data = encoded_df.values
print(encoded_data)
[[0 1 2]
 [3 2 0]
 [2 0 1]
 [2 1 0]]
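As a side note, the encoding above round-trips cleanly, but it also bakes numeric distances into the choices. A quick sanity check (illustrative only, not part of the original script):

```python
# Illustrative check: the ordinal encoding is reversible, but it imposes
# numeric distances between choices that a Matern-style kernel would
# treat as meaningful.
import numpy as np
import pandas as pd

data = [["A", "B", "C"], ["D", "C", "A"], ["C", "A", "B"], ["C", "B", "A"]]
choices = list(np.unique(np.array(data).flatten()))
choice_lookup = {choice: i for i, choice in enumerate(choices)}
encoded_df = pd.DataFrame(data).replace(choice_lookup)

# Decode back to letters to confirm the mapping round-trips.
inverse_lookup = {v: k for k, v in choice_lookup.items()}
decoded = encoded_df.replace(inverse_lookup).values.tolist()
assert decoded == data

# Under the encoding, "A" (0) and "D" (3) are three times farther apart
# than "A" (0) and "B" (1), even though categorically both pairs simply differ.
print(abs(choice_lookup["A"] - choice_lookup["D"]))  # 3
print(abs(choice_lookup["A"] - choice_lookup["B"]))  # 1
```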
Choice Parameters
nslots = 3
slot_names = ["slot_" + str(i) for i in range(nslots)]
slots = [
{
"name": slot_name,
"type": "choice",
"values": encoded_choices,
}
for slot_name in slot_names
]
print(slots) # then format via black
[ {"name": "slot_0", "type": "choice", "values": [0, 1, 2, 3]}, {"name": "slot_1", "type": "choice", "values": [0, 1, 2, 3]}, {"name": "slot_2", "type": "choice", "values": [0, 1, 2, 3]}, ]
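For reference, Ax choice-parameter dicts also accept an `is_ordered` flag, which is what steers the treatment toward ordinal (Matérn) vs. categorical (Hamming). A hedged sketch of how the slots could be marked explicitly (flipping the flag to `False` would get the Hamming kernel, but then the order constraints below no longer apply, which is exactly the tension raised in this issue):

```python
# Hedged sketch: the same slot dicts with an explicit "is_ordered" flag.
# Setting it to False marks the parameter as an unordered categorical.
nslots = 3
slot_names = ["slot_" + str(i) for i in range(nslots)]
encoded_choices = [0, 1, 2, 3]  # from the ordinal encoding above

slots = [
    {
        "name": slot_name,
        "type": "choice",
        "values": encoded_choices,
        "value_type": "int",
        "is_ordered": True,  # flip to False for a Hamming-distance treatment
    }
    for slot_name in slot_names
]
print(slots)
```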
Ordered Constraints
constraints = [
lhs + " <= " + rhs for (lhs, rhs) in list(zip(slot_names[:-1], slot_names[1:]))
]
print(constraints)
["slot_0 <= slot_1", "slot_1 <= slot_2"]
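Applying these ordering constraints to the encoded data shows how restrictive they are (illustrative check, not in the original script):

```python
# Which of the observed rows satisfy slot_0 <= slot_1 <= slot_2?
import numpy as np

encoded_data = np.array([[0, 1, 2], [3, 2, 0], [2, 0, 1], [2, 1, 0]])
feasible = np.all(encoded_data[:, :-1] <= encoded_data[:, 1:], axis=1)
print(feasible)  # only the first row ("A", "B", "C") satisfies the ordering
```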
Docs suggest ordinal parameters use Matern 5/2 kernel
Based on Support for mixed search spaces and categorical variables (docs):
The most common way of dealing with categorical variables in Bayesian optimization is to one-hot encode the categories to allow fitting a GP model in a continuous space. In this setting, a categorical variable with categories [“red”, “blue”, “green”] is represented by three new variables (one for each category). While this is a convenient choice, it can drastically increase the dimensionality of the search space. In addition, the acquisition function is often optimized in the corresponding continuous space and the final candidate is selected by rounding back to the original space, which may result in selecting sub-optimal points according to the acquisition function.
Our new approach uses separate kernels for the categorical and ordinal (continuous/integer) variables. In particular, we use a kernel of the form:
k(x, y) = k_cat(x_cat, y_cat) × k_ord(x_ord, y_ord) + k_cat(x_cat, y_cat) + k_ord(x_ord, y_ord)
For the ordinal variables we can use a standard kernel such as Matérn-5/2, but for the categorical variables we need a way to compute distances between the different categories. A natural choice is to set the distance to 0 if two categories are equal and 1 otherwise, similar to the idea of Hamming distances. This approach can be combined with the idea of automatic relevance determination (ARD), where each categorical variable has its own lengthscale. Rather than optimizing the acquisition function in a continuously relaxed space, we optimize it separately over each combination of the categorical variables. While this is likely to result in better optimization performance, it may lead to slow optimization of the acquisition function when there are many categorical variables.
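The distinction can be made concrete with a toy comparison (hypothetical lengthscales and an RBF stand-in for Matérn-5/2; this is not Ax's actual implementation):

```python
# Toy comparison (not Ax's actual kernels): an ordinal kernel sees the
# encoded values 0 and 3 as far apart, while a Hamming-style categorical
# kernel only asks "equal or not".
import numpy as np

def ordinal_kernel(x, y, lengthscale=1.0):
    # RBF stand-in for Matern-5/2: similarity decays with numeric distance.
    return np.exp(-((x - y) ** 2) / (2 * lengthscale**2))

def hamming_kernel(x, y, lengthscale=1.0):
    # Distance is 0 if equal, 1 otherwise, regardless of the numeric codes.
    return np.exp(-float(x != y) / lengthscale)

# "A"=0 vs "B"=1 and "A"=0 vs "D"=3 under each kernel:
print(ordinal_kernel(0, 1), ordinal_kernel(0, 3))  # unequal similarities
print(hamming_kernel(0, 1), hamming_kernel(0, 3))  # identical similarities
```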
It seems like ordinal variables will use a Matérn-5/2 kernel by default, in which case I'd assume the numeric values of the ordinal parameters play a significant role. Is this the case? How do I replace this with a Hamming distance instead? Is this a flag that could be incorporated into e.g. ax_client.create_experiment() or the other APIs?
Issue Analytics
- Created: 2 years ago
- Comments: 6 (6 by maintainers)
Top GitHub Comments
So this is for ordered categorical variables. For unordered ones we do indeed use the Hamming distance.
We pass down some minimal representation (a SearchSpaceDigest) into the modelbridge layer, based on which we choose what kind of model/kernel to use. E.g., if there are unordered categoricals we will end up in this branch, which chooses a model that uses both a Matérn and a Hamming distance kernel: https://github.com/facebook/Ax/blob/65dc4945d2988bc67b47320cb4d769c09f150811/ax/models/torch/botorch_modular/utils.py#L105-L115
Currently this happens automatically based on the parameter type, i.e., there isn't an easy way right now to use a Hamming distance kernel for an integer parameter.
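The automatic dispatch described here can be caricatured as follows (a hypothetical helper for illustration, not Ax source code):

```python
# Hypothetical sketch of the dispatch described above (not Ax source):
# the modelbridge inspects the search-space digest and picks a kernel
# setup based on whether any categorical parameters are unordered.
def choose_kernel_setup(categorical_features, ordinal_features):
    """categorical_features: indices of unordered parameters;
    ordinal_features: indices of ordered/continuous parameters."""
    if categorical_features:
        # Mixed case: Matern for ordinals plus Hamming for categoricals.
        return "matern+hamming"
    return "matern"

print(choose_kernel_setup([], [0, 1, 2]))   # all ordered
print(choose_kernel_setup([0, 1], [2]))     # some unordered
```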
Since these are rather complex constraints, I wanted to resurface my previous comment:
It may make the most sense to do something custom here, e.g. using a heuristic mixed-discrete optimization strategy operating directly on the set of feasible orderings in the slots. Something like this could potentially be achieved by passing in some callable that just serves as a feasibility check for whether a given slot configuration is feasible, or it may itself generate the set of feasible orderings.
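A minimal sketch of that idea, enumerating the feasible orderings directly (hypothetical helper name; real search spaces may need a smarter strategy than brute-force enumeration):

```python
# Sketch of the suggested custom strategy: enumerate slot configurations
# and keep only the feasible (sorted) orderings, rather than relying on
# the model's parameter constraints.
from itertools import permutations

def feasible_orderings(choices, nslots):
    """All ways to fill the slots with distinct choices in sorted order."""
    return [p for p in permutations(choices, nslots) if list(p) == sorted(p)]

print(feasible_orderings(["A", "B", "C", "D"], 3))
# ("A", "B", "C") is kept, ("B", "A", "C") is not
```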
cc @qingfeng10 (as modopt oncall) and @Balandat