Sacred Workflows
In an attempt to structure our discussion, I suggest using this issue to collect a wishlist of how we would like to use Sacred from a bird's-eye perspective. I suggest that we edit this issue to reflect the evolving consensus that (hopefully) emerges from the discussion below. To get things started, I can think of three basic workflows that I would love for Sacred to support. Maybe this is also a good place to think about how to integrate stages and superexperiments.
Interactive (Jupyter Notebook)
Manually control the stages of the experiment/run from an interactive environment. Most suitable for exploration and low-complexity experiments. Something like:
# -----------------------------------------------------------
# initialization
ex = Experiment('my_jupyter_experiment')
ex.add_observer(FileStorageObserver('tmp'))
# -----------------------------------------------------------
# config and functions
cfg = Configuration()
cfg.learn_rate = 0.01
cfg.hidden_sizes = [100, 100]
cfg.batch_size = 32

@ex.capture
def get_dataset(batch_size):
    ...

# -----------------------------------------------------------
# run experiment
ex.start()                       # finalizes config, starts observers
data = get_dataset()             # call captured functions
for i in range(1000):
    # do something
    ex.log_metric('loss', loss)  # log metrics, artifacts, etc.
ex.stop(result=final_loss)
# -----------------------------------------------------------
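For comparison, a rough approximation of this workflow is already possible with the current API by marking the experiment as interactive and wrapping the manual steps in a main function. This is only a minimal sketch; the exact observer constructor and the log_scalar signature are assumptions about recent Sacred versions:

from sacred import Experiment
from sacred.observers import FileStorageObserver

ex = Experiment('my_jupyter_experiment', interactive=True)
ex.add_observer(FileStorageObserver('tmp'))

@ex.config
def cfg():
    learn_rate = 0.01
    batch_size = 32

@ex.main
def run(_run, learn_rate, batch_size):
    for i in range(1000):
        loss = 1.0 / (i + 1)            # placeholder computation
        _run.log_scalar('loss', loss, i)
    return loss

ex.run()

The main difference is that everything still has to live inside a single main function, which is exactly what the proposed start()/stop() pair would relax.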
Scripting
Using a main script that contains most of the experiment and is run from the command line. This is the current main workflow, most suitable for low- to medium-complexity experiments.
ex = Experiment('my_experiment_script')

@ex.config
def config(cfg):
    cfg.learn_rate = 0.01
    ...

@ex.capture
def get_dataset(batch_size):
    ...

@ex.automain  # define a main function which automatically starts and stops the experiment
def main():
    ...  # do stuff, log metrics, etc.
    return final_loss
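For reference, the existing command-line interface already supports config overrides for this style of script, e.g. python my_experiment_script.py with learn_rate=0.05 applies an update before the run starts. The same run can also be triggered programmatically; this is a sketch assuming the current ex.run signature:

# programmatic equivalent of "python my_experiment_script.py with learn_rate=0.05"
run = ex.run(config_updates={'learn_rate': 0.05})
print(run.result)  # the value returned by main()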
Object Oriented
This is a long-standing feature request #193. Define an experiment as a class to improve modularity (and to support frameworks like ray.tune). It should cater to medium- to high-complexity experiments. Very tentative API sketch:
class MyExperiment(Experiment):
    def __init__(self, config=None):  # context-config to deal with updates and nesting
        super().__init__(config)
        self.learn_rate = 0.001       # using self to store config improves IDE support
        ...

    def get_dataset(self):  # no capturing needed; self gives access to the config anyway
        return ...

    @main  # mark main functions / commands
    def main_function(self):
        ...  # do stuff
        return final_loss

ex = MyExperiment(config=get_commandline_updates())
ex.run()
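One motivation for this shape is composability: under the sketched (hypothetical) API, an outer tool such as a hyperparameter search could instantiate the class directly instead of going through the command line. Purely illustrative, built on the hypothetical MyExperiment class above:

# run the same experiment class with several config variants
results = {}
for lr in [0.1, 0.01, 0.001]:
    ex = MyExperiment(config={'learn_rate': lr})
    results[lr] = ex.run()
best_lr = min(results, key=results.get)  # assuming run() returns final_loss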
Issue Analytics
- Created 4 years ago
- Comments: 27 (6 by maintainers)
Regarding “reinterpret the concept of calling a command”: it would be useful for any kind of parallel experiment, e.g., MPI.
We may have to agree to disagree that this is the proper abstraction 😃 though it may certainly have its merits!
As hyperparameter optimization gets more sophisticated, it seems more reasonable to me to consider each hyperparameter optimization run an experiment itself that should be fully reproducible. From that perspective, each trial need not be a separate experiment with its own independent observers. Instead, observations should ideally be made at the level of the hyperparameter optimizer. The current ray tune design may be a bad fit for this though – I’m not sure.
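A minimal sketch of that perspective using the existing API, where the whole search is one experiment and per-trial observations are logged at the optimizer level (the random-search loop and the objective are placeholders; log_scalar and the _run injection are existing Sacred features):

import random
from sacred import Experiment
from sacred.observers import FileStorageObserver

ex = Experiment('random_search')
ex.add_observer(FileStorageObserver('search_runs'))

@ex.config
def cfg():
    n_trials = 20

@ex.automain
def search(_run, n_trials):
    best = float('inf')
    for trial in range(n_trials):
        lr = 10 ** random.uniform(-4, -1)   # sample a hyperparameter
        loss = (lr - 0.01) ** 2             # placeholder objective
        _run.log_scalar('trial.learn_rate', lr, trial)
        _run.log_scalar('trial.loss', loss, trial)
        best = min(best, loss)
    return best  # the entire search is a single reproducible run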