How to properly save and load an experiment


I have a modified version of this tutorial: https://botorch.org/tutorials/custom_botorch_model_in_ax

In my version, I save the experiment after each call to get_botorch:

        for i in range(len(exp.trials.values()), num_bo_trails+2):
            print('Running optimization batch {}/{}'.format(i+1, num_bo_trails))
            model = get_botorch(experiment=exp, data=exp.eval(), search_space=exp.search_space,
                                model_constructor=_get_and_fit_gp)

            save(exp, args.bo_save_path)
            batch = exp.new_trial(generator_run=model.gen(1))

If that loop gets interrupted, I want to be able to reload the experiment and restart the loop from where it left off. However, I get this error:

    File "Torch1venv/venv/lib/python3.6/site-packages/ax/core/observation.py", line 189, in observations_from_data
      obs_parameters = experiment.arms_by_name[features["arm_name"]].parameters.copy()
    KeyError: '0'

This happens on the first get_botorch call after I load the experiment again.

Also, I noticed that the trial status always seems to be 'status=TrialStatus.RUNNING' and never completed. Do I need to manually set trials to completed?

Thanks.

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

ldworkin commented, May 13, 2019 (1 reaction)

Hey @arvieFrydenlund – sorry, I forgot to mention this! It’s a simple fix on your end. After you load the experiment from the json file, you’ll just need to re-set the evaluation function, e.g.

exp = load(bo_save_path)
exp.evaluation_function = run

We don’t store evaluation functions, since function serialization is a difficult problem. We should make this more clear though 😃
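To illustrate the pattern described above, here is a minimal, self-contained sketch. It deliberately does not use the Ax API: `TinyExperiment`, `save_experiment`, `load_experiment`, and `optimize` are hypothetical stand-ins written only for this example, and `run` is a toy objective. The point is that only plain data is written to the JSON file, so the evaluation function has to be attached again after loading before the loop can resume.

```python
import json
import os
import tempfile


class TinyExperiment:
    """Hypothetical stand-in for an experiment object; this is NOT the Ax API."""

    def __init__(self, evaluation_function=None):
        self.evaluation_function = evaluation_function
        self.trials = []  # each trial is a dict: {"x": float, "y": float}

    def run_trial(self, x):
        if self.evaluation_function is None:
            raise RuntimeError("evaluation_function must be re-set after loading")
        self.trials.append({"x": x, "y": self.evaluation_function(x)})


def save_experiment(exp, path):
    # Only plain data is written; the callable is deliberately left out.
    with open(path, "w") as f:
        json.dump({"trials": exp.trials}, f)


def load_experiment(path):
    # The returned experiment has its trials but no evaluation function.
    with open(path) as f:
        state = json.load(f)
    exp = TinyExperiment()
    exp.trials = state["trials"]
    return exp


def run(x):
    # Toy objective standing in for the real evaluation function.
    return (x - 1.0) ** 2


def optimize(exp, path, total_trials):
    # Start from the number of trials already recorded, so a reloaded
    # experiment continues where the previous run stopped.
    for i in range(len(exp.trials), total_trials):
        exp.run_trial(float(i))
        save_experiment(exp, path)


path = os.path.join(tempfile.mkdtemp(), "exp.json")

exp = TinyExperiment(evaluation_function=run)
optimize(exp, path, total_trials=2)  # pretend the process stops here

resumed = load_experiment(path)
resumed.evaluation_function = run  # re-attach the function after loading
optimize(resumed, path, total_trials=5)
print(len(resumed.trials))
```

In this sketch, forgetting the `resumed.evaluation_function = run` line makes the first resumed trial raise a `RuntimeError`. The real library may fail differently when the function is missing.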

lena-kashtelyan commented, May 7, 2019 (1 reaction)

Hey, @arvieFrydenlund, we tried to repro the bug you are getting, and couldn't get the same issue to come up. Would you mind sharing your full notebook?
