[Question] reusing existing experiment data.

The data I’m trying to reuse comes from orthogonal search subspaces: the complete search space is [h1, h2, h3, h4], and the data I want to reuse are the results of searching in [h1, h4] and [h2, h3].

I attached all the existing data (more than 50 data points for each subspace) to a GPEI model and used it to generate new trials. Here are my questions.

  1. Is this way of reusing data encouraged? If not, what is needed to achieve this goal?
  2. When generating new trials, I sometimes get errors like "NaNs encounterd when trying to perform matrix-vector multiplication", and sometimes it succeeds. I think it may be related to the random seed of the Bayesian model; is that right? How can I avoid this?
  3. Training and visualizing with the reduced hparam/metric set works. However, when visualizing with the full hparam/metric set using existing data, it seems that I have to generate one trial with the Bayesian model first. Is there any way to make the model aware of the attached data without generating the next trial (which I think is memory-consuming)?

Thanks!

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

2 reactions
2timesjay commented, Dec 4, 2019

@showgood163

If you add dimensions to your search space, past data that covers only a single point on a new dimension doesn’t allow the model to estimate a Gaussian Process lengthscale for that dimension (this is likely what leads to the numerical errors).
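One way to spot this situation before fitting is to count how many distinct values each parameter takes in the past data. The sketch below is plain Python, not Ax code: the helper name `degenerate_dimensions` and the trial dictionaries are hypothetical, and it assumes the dimensions that were not searched were recorded at a fixed default value.

```python
from typing import Dict, List


def degenerate_dimensions(trials: List[Dict[str, float]], names: List[str]) -> List[str]:
    """Return the parameters that take only one distinct value across all trials."""
    return [name for name in names if len({t[name] for t in trials}) <= 1]


# Hypothetical past data: only h1 and h4 were searched,
# while h2 and h3 stayed at an assumed default of 0.5.
past = [
    {"h1": 0.10, "h2": 0.5, "h3": 0.5, "h4": 1.0},
    {"h1": 0.35, "h2": 0.5, "h3": 0.5, "h4": 2.0},
    {"h1": 0.80, "h2": 0.5, "h3": 0.5, "h4": 4.0},
]

# Dimensions listed here give the model nothing to estimate a lengthscale from.
flagged = degenerate_dimensions(past, ["h1", "h2", "h3", "h4"])
print(flagged)
```

Any dimension that shows up in `flagged` is a candidate for the re-initialization described in this comment.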

The right thing to do here would be a small re-initialization over the entire new space, followed by re-adding the old data.

If you are extending a dimension that already covers a range, then starting with the past data directly can work, but initializing with a Sobol search in the full new SearchSpace is always the most stable solution.
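The idea of re-initializing over the full space and then re-adding the old data can be illustrated without Ax. The sketch below is an assumption-laden stand-in: it uses a Halton sequence rather than Ax's Sobol generator (both are low-discrepancy sequences), and the bounds, the old trials, and the helper names `halton` and `init_points` are all hypothetical.

```python
from typing import Dict, List, Tuple


def halton(index: int, base: int) -> float:
    """Radical inverse of `index` in `base`; a value in [0, 1)."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def init_points(bounds: Dict[str, Tuple[float, float]], n: int) -> List[Dict[str, float]]:
    """Spread n quasi-random points over the full space (Halton, standing in for Sobol)."""
    bases = [2, 3, 5, 7]  # one prime base per dimension
    if len(bounds) > len(bases):
        raise ValueError("this sketch supports at most 4 dimensions")
    points = []
    for i in range(1, n + 1):
        point = {}
        for (name, (low, high)), base in zip(bounds.items(), bases):
            point[name] = low + (high - low) * halton(i, base)
        points.append(point)
    return points


# Hypothetical full search space and old subspace data (h2, h3 fixed at 0.5).
bounds = {"h1": (0.0, 1.0), "h2": (0.0, 1.0), "h3": (0.0, 1.0), "h4": (1.0, 8.0)}
old_trials = [
    {"h1": 0.10, "h2": 0.5, "h3": 0.5, "h4": 1.0},
    {"h1": 0.80, "h2": 0.5, "h3": 0.5, "h4": 4.0},
]

# Re-initialize over the whole space first, then re-add the old data.
new_points = init_points(bounds, 8)
training_inputs = new_points + old_trials
print(len(training_inputs))
```

In a real experiment, each of the new points would have to be evaluated before it can be used for fitting, and the initialization itself would come from Ax's own Sobol step rather than this helper.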

1 reaction
lena-kashtelyan commented, Nov 21, 2019

@showgood163, I didn’t mean to belabor what you already knew; I was just answering the question of whether it’s possible to make the model aware of the attached data without generating more trials : )

Also, just to make sure: you needed to call fetch_data on an Experiment (which you retrieved from AxClient), not a SimpleExperiment, correct? For SimpleExperiment, eval is the correct function.
