[tune] Unexpected num_samples and trial execution (setting of self.config) behavior
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Kubuntu 18.04
- Ray installed from (source or binary): Installed binary via pip install
- Ray version: 0.7.6
- Python version: 3.6.8
- Exact command to reproduce:
Describe the problem
This is in reference to the example ray/tune/examples/hyperband_example.py, which sets num_samples=20.
If I instead set the Experiment num_samples=1, then:
- 1 MyTrainableClass instance is created
- _setup() is called once after self.config is set to a randomly generated config dict in that one MyTrainableClass instance
- _train() is called on that one instance many times, without any change to self.config in that one MyTrainableClass instance
If I instead set the Experiment num_samples=2, then:
- 2 MyTrainableClass instances are created, and for each instance:
- _setup() is called once after self.config is set to a randomly generated config dict in the MyTrainableClass instance
- _train() is called many times on each instance, without any change to self.config, over the course of the experiment run
However, the docs at https://ray.readthedocs.io/en/latest/tune-usage.html say: “E.g. in the above, num_samples=10 repeats the 3x3 grid search 10 times, for a total of 90 trials, each with randomly sampled values of alpha and beta”
but in my testing, the number of distinctly configured trials (i.e. distinctly different self.config dicts) is limited to num_samples.

The issue I am having is that in my own MyTrainableClass, for a given self.config dict of values, _train() will always return exactly the same result, so there is no point in calling it more than once for the same self.config. I was expecting self.config to be different each time _train() ends up getting called (presumably exactly once per trial / MyTrainableClass instance created).
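For reference, the docs example being quoted corresponds roughly to a config like the following sketch (the grid-searched parameter names are illustrative, not taken verbatim from the docs):

```python
import numpy as np
from ray import tune

# Illustrative sketch of the quoted docs behavior: the two grid_search
# parameters form a 3x3 grid, while alpha and beta are re-sampled for every
# trial. With num_samples=10 the 3x3 grid is repeated 10 times, giving
# 3 * 3 * 10 = 90 trials, each with its own config dict.
config = {
    "alpha": tune.sample_from(lambda spec: np.random.uniform(0, 1)),
    "beta": tune.sample_from(lambda spec: np.random.normal()),
    "layer1": tune.grid_search([16, 64, 256]),
    "layer2": tune.grid_search([16, 64, 256]),
}
```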
My experiment setup (for XGBoost) looks like this:
```python
# Imports added for completeness (Ray 0.7.6 API). choice, randint, and
# loguniform are sampling helpers imported elsewhere; they are not shown
# in the original snippet.
from ray import tune
from ray.tune import Experiment, sample_from
from ray.tune.schedulers import HyperBandScheduler

hyperBandScheduler = HyperBandScheduler(
    time_attr = "training_iteration",
    # metric = "episode_reward_mean",  # duplicate keyword; mean_accuracy is the metric actually used
    # Want to maximize mean_accuracy
    metric = "mean_accuracy",
    mode = "max",
    max_t = 100)

exp = Experiment(
    resources_per_trial = {"cpu": 1, "gpu": 0},
    stop = {"training_iteration": 99999},
    name = "xgb_auc_optimizer",
    run = MyTrainableClass,
    config = {
        "eval_metric": 'auc',
        "booster": sample_from(lambda spec: choice(["dart", "gbtree", "gblinear"])),
        "max_depth": sample_from(lambda spec: randint(1, 9)),
        "eta": sample_from(lambda spec: loguniform(1e-4, 1e-1)),
        "gamma": sample_from(lambda spec: loguniform(1e-8, 1.0)),
        "grow_policy": sample_from(lambda spec: choice(['depthwise', 'lossguide'])),
    },
)

tune.run(exp,
    scheduler = hyperBandScheduler,
    verbose = 0,
)
```
And _train() returns a dict containing a “mean_accuracy” value.
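For context, a minimal, hypothetical sketch of the shape of such a Trainable (the actual XGBoost training code is not part of the issue, and _fit_and_score below is a stand-in):

```python
from ray.tune import Trainable

class MyTrainableClass(Trainable):
    """Hypothetical sketch of the Trainable described above."""

    def _setup(self, config):
        # Tune sets self.config to the sampled hyperparameter dict for this
        # trial before _setup() runs; one-time initialization goes here.
        pass

    def _train(self):
        # Train an XGBoost model with self.config and score it. For a fixed
        # self.config this is deterministic, so repeated _train() calls
        # return the same value.
        accuracy = self._fit_and_score(self.config)
        return {"mean_accuracy": accuracy}

    def _fit_and_score(self, params):
        # Stand-in for the real xgboost.train(...) + validation scoring.
        return 0.5
```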
From my reading of the documentation, I thought that num_samples should be left at its default of 1, and that the hyperparameter search mechanism would create up to max_t=100 different MyTrainableClass instances, each with its own unique self.config, and each having _setup() and then _train() called only once, with the overall goal of searching for the maximum ‘mean_accuracy’.
I am not sure whether what I have described is an issue with ray/tune, or whether I am not using it as intended. Questions:
- Should I be using grid_search() rather than sample_from() for any of the search parameters in my config?
- Can anyone suggest either: a) what I need to change to run an effective hyperparameter search for my MyTrainableClass class, or b) whether what I have explained is a bug in ray/tune?
Thanks
Top GitHub Comments
This should be 45 different configs 😃 (referring to the num_samples=5 example in the next comment: 3 x 3 x 5 = 45, not 90.)
Thanks for opening this issue @andrewv99!

Ah, I can see that being confusing: _train should probably be better named _step. It will be called many times for one parameter configuration (think of it like an epoch). If you’re only going to call _train once, you should set stop={"training_iteration": 1}.

sample_from generates one value every time it is invoked. grid_search makes it so that all of the listed values are evaluated. num_samples multiplies the cardinality of the given specification, i.e.:
- x: grid_search([1, 2, 3, 4]), num_samples=1 = 4 different configs
- x: grid_search([1, 2, 3]), num_samples=1 = 3 different configs
- x: grid_search([1, 2, 3]), num_samples=2 = 6 different configs
- x: grid_search([1, 2, 3]), y: grid_search([a, b, c]), num_samples=1 = 9 different configs
- x: grid_search([1, 2, 3]), y: grid_search([a, b, c]), num_samples=2 = 18 different configs
- x: grid_search([1, 2, 3]), y: grid_search([a, b, c]), num_samples=5 = 90 different configs
- x: sample_from(...), y: grid_search([a, b, c]), num_samples=2 = 6 different configs
- x: sample_from(...), y: sample_from(...), num_samples=13 = 13 different configs

Feel free to ask any questions that you have. Also, if this clears up your confusion, could you provide some suggestions as to what I can do to make this clearer in the docs?
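Putting those two points together for the setup above, a minimal sketch (reusing the names and sampling helpers from the snippets earlier in the issue, with an illustrative num_samples value) would keep sample_from for every parameter, ask for the desired number of distinct configs via num_samples, and stop each trial after a single _train() call:

```python
# Sketch only: with sample_from parameters, num_samples determines how many
# distinct self.config dicts are generated, and stop={"training_iteration": 1}
# makes each trial call _train() exactly once.
exp = Experiment(
    name = "xgb_auc_optimizer",
    run = MyTrainableClass,
    num_samples = 50,                     # 50 distinct configs (illustrative)
    stop = {"training_iteration": 1},     # one _train() call per trial
    resources_per_trial = {"cpu": 1, "gpu": 0},
    config = {
        "eval_metric": 'auc',
        "booster": sample_from(lambda spec: choice(["dart", "gbtree", "gblinear"])),
        "max_depth": sample_from(lambda spec: randint(1, 9)),
        # ... remaining sample_from parameters as in the original config ...
    },
)

tune.run(exp, verbose=0)
```

With single-iteration trials, every trial reports exactly one mean_accuracy, so the breadth of the search is controlled entirely by num_samples.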