Automate selection of appropriate parameters for BoTorch components in Ax based on experiment and data size
Hello again (I hope I am not causing too much trouble to the team 😃),

I am here to report a possible bug: attaching many trials through the Service API consumes a lot of memory. Here is an example. (Warning: a couple of my machines started thrashing and then froze while running the following code.)
```python
from ax.service.ax_client import AxClient
from ax.service.utils.instantiation import ObjectiveProperties
import itertools


def evaluate(args):
    # Constant objective value; the memory issue does not depend on the metric.
    return {
        'a': (5_000, 0.0),
    }


ax_client = AxClient(random_seed=64)
ax_client.create_experiment(
    name="ax_err",
    parameters=[
        {'name': 'p1', 'type': 'range', 'bounds': [0, 5000], 'value_type': 'int'},
        {'name': 'p2', 'type': 'range', 'bounds': [0, 6000], 'value_type': 'int'},
        {'name': 'p3', 'type': 'range', 'bounds': [0, 7000], 'value_type': 'int'},
    ],
    objectives={
        'a': ObjectiveProperties(minimize=True, threshold=10_000)
    },
)


def range_float(stop, percent=10 / 100):
    # Evenly spaced integers from 0 up to `stop`, stepping by `percent * stop`;
    # percent=10/100 yields 10 values per parameter.
    l = []
    c = 0
    while c < stop:
        l.append(int(c))
        c += percent * stop
    return l


r_p1 = range_float(5000)
r_p2 = range_float(6000)
r_p3 = range_float(7000)

# Attach and complete one trial per grid point (10 * 10 * 10 = 1000 trials).
for p1, p2, p3 in itertools.product(r_p1, r_p2, r_p3):
    config, trial_index = ax_client.attach_trial({
        'p1': p1,
        'p2': p2,
        'p3': p3,
    })
    evaluations = evaluate(config)
    ax_client.complete_trial(trial_index=trial_index, raw_data=evaluations)

# Ask for 15 model-generated trials; this is where memory consumption blows up.
for _ in range(15):
    config, trial_index = ax_client.get_next_trial()
    evaluations = evaluate(config)
    ax_client.complete_trial(trial_index=trial_index, raw_data=evaluations)
```
I've replicated this issue with MOO. Reducing the number of attached trials by increasing the percentage (e.g. to 20/100) does not cause this behavior. If I had to guess, it does not appear to be a memory leak; I think some internal operation of Ax/BoTorch has very steep memory complexity.
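For reference, the number of attached trials grows with the cube of the grid points per parameter (roughly `1 / percent`). A quick back-of-the-envelope check of the two configurations mentioned above, reusing the `range_float` helper from the script (my own illustration, not part of the original report):

```python
# Assumes range_float from the repro script is in scope.
for percent in (10 / 100, 20 / 100):
    points = len(range_float(5000, percent))  # 6000 and 7000 give the same count
    print(f"percent={percent}: {points}^3 = {points ** 3} attached trials")
# percent=0.1 -> 10^3 = 1000 trials; percent=0.2 -> 5^3 = 125 trials
```

So the failing configuration fits a model on roughly 1000 observations, while the working one uses only 125.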
Top GitHub Comments
Hi @josalhor, sorry for the delay on this! Let me split up this issue into two:

To limit memory consumption of qNEI, the easiest way is probably to use our modular BotAx setup (so `Models.BOTORCH_MODULAR` instead of the `Models.GPEI` that AxClient is using for you under the hood right now). To do so, you'll need to:
- set up a generation strategy that uses `Models.BOTORCH_MODULAR` and pass acquisition function options to it (check out the "using `Models.BOTORCH_MODULAR` in generation strategies" section of the modular BotAx tutorial for instructions);
- in `model_kwargs` for the BoTorch generation step, set `"acquisition_options"` to something like `{"optimizer_options": {"num_restarts": 10, "raw_samples": 256}}`;
- pass the resulting generation strategy to `AxClient` via `AxClient(generation_strategy=...)`.

You should end up with something like this:
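(The original snippet isn't reproduced here; the following is a minimal sketch based on the steps above. The Sobol warm-up step, its trial count, and the exact import paths are assumptions and may differ between Ax versions.)

```python
from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.registry import Models
from ax.service.ax_client import AxClient

generation_strategy = GenerationStrategy(
    steps=[
        # Quasi-random initialization before handing over to the BoTorch model
        # (assumed here; adjust or drop depending on how many trials you attach).
        GenerationStep(model=Models.SOBOL, num_trials=5),
        # Modular BoTorch step with a reduced acquisition-optimizer budget,
        # using the "acquisition_options" suggested above.
        GenerationStep(
            model=Models.BOTORCH_MODULAR,
            num_trials=-1,  # -1: use this model for all remaining trials
            model_kwargs={
                "acquisition_options": {
                    "optimizer_options": {"num_restarts": 10, "raw_samples": 256},
                },
            },
        ),
    ]
)

ax_client = AxClient(generation_strategy=generation_strategy, random_seed=64)
# create_experiment / attach_trial / get_next_trial then proceed as in the repro script.
```

The exact `GenerationStep` arguments have changed between Ax releases, so check the modular BoTorch tutorial for the version you are on.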
Let us know if this doesn’t work for you! With this, I’ll consider part 1 of the issue resolved and will mark it as wishlist for part 2.
This is peak memory consumption while adjusting the `percent` variable:

[peak memory chart]

This memory consumption comes in two phases. Here is a screenshot for the `15 / 100` entry:

[screenshot]

Here it is for the `12 / 20` one (manually killed):

[screenshot]

I've seen runs where the valley between the two phases is far less pronounced. It may be an issue that comes from both high memory consumption and garbage-collection weirdness.
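For anyone who wants to reproduce these measurements, one simple way to record the peak memory of the Python process (my own sketch, not necessarily how the numbers above were collected) is the Unix-only `resource` module:

```python
import resource

def peak_rss_mib():
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

# For example, print after the attach loop and after each get_next_trial call:
print(f"peak RSS so far: {peak_rss_mib():.1f} MiB")
```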