[Question] TrialStatus not marked as completed when using fetch data. `trial.status.expecting_data` is True
Hello,
I have some doubts about the BO procedure of appending new trials, running them, fetching the data, and updating the model.
First, after calling `experiment.fetch_data()`, the `trial.status.expecting_data` value is still `True`, which makes the trial run once again every time `experiment.fetch_data()` is run.
In my case I have the following procedure:
- Define a quasirandom model:

```python
sobol = get_sobol(exp.search_space)
```
- I have this function:

```python
def model_loop(model, batch_size, experiment):
    """
    Makes a loop of random generation and fetching
    """
    if batch_size > 1:
        n = batch_size
        _ = experiment.new_batch_trial(
            generator_run=model.gen(n=n)).run()
    else:
        n = 1
        _ = experiment.new_trial(
            generator_run=model.gen(n=n)).run()
    data = experiment.fetch_data()
    return experiment, model, data
```
- Make a pass of the Sobol model with the `model_loop` function:

```python
exp, sobol, data = model_loop(
    sobol,
    batch,
    experiment)
```
- Define the BoTorch model and use it the same way, in a loop like:

```python
for i in range(n):
    exp, botorch, data = model_loop(
        get_botorch(experiment=exp, data=data),
        batch_size,
        exp)
```
The thing is that the trials are not marked as `TrialStatus.COMPLETED` after calling `experiment.fetch_data()`. Should I mark them myself? Also, even when I mark them as completed, their `expecting_data` status is still `True`, so every time `fetch_data` is run, literally all trials are run once again. How can this be solved?
I see in class `TrialStatus` that the property `expecting_data` is `True` for both `RUNNING` and `COMPLETED`. Shouldn't it be `False` for `COMPLETED`?
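For illustration, a minimal sketch of the check described above (it assumes the `exp` experiment built in the loop and the `status` / `expecting_data` attributes quoted from Ax):

```python
# Inspect every trial after fetching data: according to the behaviour
# described above, the statuses stay RUNNING (expecting_data stays True),
# so fetch_data() evaluates all trials again on every call.
data = exp.fetch_data()
for index, trial in exp.trials.items():
    print(index, trial.status, trial.status.expecting_data)
```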
Maybe I have missed something because I do not understand the current behaviour.
Thanks
Top GitHub Comments
Hi @BCJuan! A few things:
- You're correct that `experiment.fetch_data()` doesn't mark trials as completed. That's because the Developer API is built to be as flexible as possible, so you have to explicitly manage the lifecycle of your trials yourself. So yup, go ahead and mark them completed yourself.
- The behavior of `expecting_data` is correct. Again, for maximum flexibility in the Developer API, we want it to be possible to fetch data for completed trials as well as for running trials. This is desired for several applications, like A/B testing.
- Can you give us more information about how your data fetching works? That might help us figure out how best to support your use case. You might also want to take a look at the Service or Loop APIs, which might be better suited to you, but we'll have a better sense of that once we know how your metrics are implemented.
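As a rough illustration (not the only way to do it), the `model_loop` from the question could be adapted to complete trials explicitly. This is a minimal sketch assuming the Developer API's `BaseTrial.mark_completed()`; where exactly the call belongs depends on how your metrics fetch data:

```python
def model_loop(model, batch_size, experiment):
    """Generate a (batch) trial, run it, fetch data, and explicitly
    mark the trial as completed, since the Developer API leaves trial
    lifecycle management to the caller."""
    if batch_size > 1:
        trial = experiment.new_batch_trial(generator_run=model.gen(n=batch_size)).run()
    else:
        trial = experiment.new_trial(generator_run=model.gen(n=1)).run()
    data = experiment.fetch_data()
    trial.mark_completed()  # trial is no longer treated as still running
    return experiment, model, data
```

In the Service API this bookkeeping is handled for you by `AxClient.complete_trial`, which marks the trial completed and attaches its data.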
cc @lena-kashtelyan
Yes, I am unblocked for now. Thank you very much. I think it is appropriate to close the issue.