[Question] reusing existing experiment data.

The data I’m trying to reuse comes from orthogonal search subspaces, e.g. the complete search space consists of [h1,h2,h3,h4], and the data I want to reuse is the result of searching in [h1,h4] and [h2,h3].

I attached all the existing data (>50 for each subspace) to a GPEI model and used it to generate new trials. Here are my questions.

1. Is this way of reusing data encouraged? If not, what is needed to achieve this goal?
2. When generating new trials, I sometimes get errors like "NaNs encounterd when trying to perform matrix-vector multiplication", and sometimes generation succeeds. I think it may be related to the random seed of the Bayesian model; is that right? How can I avoid this?
3. Training and visualizing with the reduced hparam/metric set is OK. However, to visualize with the full hparam/metric set using existing data, it seems that I first have to generate one trial with the Bayesian model. Is there any way to make the model aware of the attached data without generating the next trial (which I think is memory-consuming)?

Thanks!

Issue Analytics

State:
Created 4 years ago
Comments: 5 (4 by maintainers)

Top GitHub Comments

2timesjay commented, Dec 4, 2019

@showgood163

If you are adding additional dimensions to your search space, a single point on that dimension from past data doesn’t allow the model to estimate a Gaussian Process lengthscale for that dimension (this is likely what leads to the numerical errors).
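To illustrate the point above, here is a minimal sketch in plain Python. It is not Ax or BoTorch code: `rbf_kernel`, `kernel_matrix`, and the sample points are hypothetical and assume a squared-exponential kernel with one lengthscale per dimension. When every past point has the same value on a dimension, the kernel matrix does not change with that dimension's lengthscale, so the data cannot constrain it.

```python
import math

def rbf_kernel(x, y, lengthscales):
    """Squared-exponential kernel with one lengthscale per dimension."""
    sq = sum(((a - b) / l) ** 2 for a, b, l in zip(x, y, lengthscales))
    return math.exp(-0.5 * sq)

def kernel_matrix(points, lengthscales):
    """Pairwise kernel values for a list of points."""
    return [[rbf_kernel(p, q, lengthscales) for q in points] for p in points]

# Hypothetical past data: h1 varies, but the new dimension h2 is stuck at 0.5.
stuck = [(0.1, 0.5), (0.4, 0.5), (0.9, 0.5)]
# Hypothetical data in which h2 also varies.
varied = [(0.1, 0.2), (0.4, 0.5), (0.9, 0.8)]

# Compare a very short and a very long lengthscale on h2.
stuck_same = kernel_matrix(stuck, (0.3, 0.01)) == kernel_matrix(stuck, (0.3, 100.0))
varied_same = kernel_matrix(varied, (0.3, 0.01)) == kernel_matrix(varied, (0.3, 100.0))

print(stuck_same)   # the h2 lengthscale has no effect on the stuck data
print(varied_same)  # with variation on h2, the lengthscale matters
```

With the stuck data any h2 lengthscale explains the observations equally well, which is one plausible way for a fit to drift to extreme values and become numerically unstable.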

The right thing to do here would be to do a small re-initialization of the entire new space and then re-add the old data.

If you are extending a dimension that already covers a range, then starting with the past data directly can work, but initializing with a Sobol search in the full new SearchSpace is always the most stable solution.
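The data layout this suggests can be sketched as follows. This is only an illustration, not Ax code: `FULL_SPACE`, `init_points`, `pad_to_full_space`, the default values, and the old points are all hypothetical, and seeded uniform sampling from the standard library stands in for a real Sobol generator.

```python
import random

# Hypothetical full search space: name -> (lower bound, upper bound).
FULL_SPACE = {"h1": (0.0, 1.0), "h2": (0.0, 1.0), "h3": (0.0, 1.0), "h4": (0.0, 1.0)}

def init_points(n, seed=0):
    """Draw n points over the full space (uniform stand-in for Sobol)."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(lo, hi) for name, (lo, hi) in FULL_SPACE.items()}
        for _ in range(n)
    ]

def pad_to_full_space(partial, defaults):
    """Fill the dimensions an old point did not search with fixed defaults."""
    return {name: partial.get(name, defaults[name]) for name in FULL_SPACE}

def distinct_values(points, name):
    """Number of distinct values observed on one dimension."""
    return len({p[name] for p in points})

# Hypothetical old results from the [h1, h4] subspace only.
defaults = {"h1": 0.5, "h2": 0.5, "h3": 0.5, "h4": 0.5}
old = [{"h1": 0.2, "h4": 0.7}, {"h1": 0.8, "h4": 0.1}]
old_padded = [pad_to_full_space(p, defaults) for p in old]

# Re-initialize over the full space first, then re-add the old data.
training_set = init_points(8) + old_padded

for name in FULL_SPACE:
    print(name, distinct_values(old_padded, name), distinct_values(training_set, name))
```

On their own, the padded old points have a single value on h2 and h3; after the re-initialization every dimension has several. In a real experiment the initialization points would of course have to be evaluated as trials before the model can use them.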

lena-kashtelyan commented, Nov 21, 2019

@showgood163, didn’t mean to belabor what you already knew, was just answering the question of whether it’s possible to let the model be aware of the attached data w/out generating more trials : )

Also, just to make sure: you needed to do fetch_data with an Experiment (which you retrieved from AxClient), not a SimpleExperiment, correct? For SimpleExperiment, eval is the correct function.