[tune] Hyperband Scheduler Not working
What is the problem?
The HyperBand scheduler does not work reliably: some experiments succeed, but about 50% fail. I tried Ray versions 0.7.2 and 0.9.0.dev0 with Python 3.7 on Ubuntu 18.04.
The function update_trial_stats(self, trial, result) fails with the following error:
Failure # 1 (occurred at 2020-02-02_14-45-58)
Traceback (most recent call last):
  File "/home/kaleab/anaconda3/envs/research/lib/python3.7/site-packages/ray/tune/trial_runner.py", line 461, in _process_trial
    self, trial, flat_result)
  File "/home/kaleab/anaconda3/envs/research/lib/python3.7/site-packages/ray/tune/schedulers/hyperband.py", line 172, in on_trial_result
    bracket.update_trial_stats(trial, result)
  File "/home/kaleab/anaconda3/envs/research/lib/python3.7/site-packages/ray/tune/schedulers/hyperband.py", line 382, in update_trial_stats
    assert delta >= 0
AssertionError
I have run validate_save_restore(trainable_class, config=validate_save_config)
and
validate_save_restore(trainable_class, config=validate_save_config, use_object_store=True)
and they both succeed.
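One possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: the assertion looks like a check that the value of time_attr reported by a trial never decreases between consecutive results. The sketch below is a simplified stand-in for that bookkeeping, not Ray's actual code. BracketModel, first_backward_step, and their attributes are hypothetical names. It shows how a result whose training_iteration goes backwards (for example, if a counter were reset after a restore) would trip the same kind of assertion.

```python
# Simplified, hypothetical model of per-trial progress bookkeeping.
# This is NOT Ray's implementation; all names here are illustrative.


class BracketModel:
    def __init__(self, time_attr="training_iteration"):
        self.time_attr = time_attr
        self.last_result = {}  # trial id -> most recent result dict
        self.completed_progress = 0

    def update_trial_stats(self, trial_id, result):
        # Assumption: delta is the change in time_attr since the last result.
        previous = self.last_result.get(trial_id, {self.time_attr: 0})
        delta = result[self.time_attr] - previous[self.time_attr]
        # Same shape as the failing check in the traceback.
        assert delta >= 0, (
            f"{self.time_attr} went backwards for {trial_id}: "
            f"{previous[self.time_attr]} -> {result[self.time_attr]}"
        )
        self.completed_progress += delta
        self.last_result[trial_id] = result


def first_backward_step(results, time_attr="training_iteration"):
    """Return the index of the first result whose time_attr decreased, or None."""
    for i in range(1, len(results)):
        if results[i][time_attr] < results[i - 1][time_attr]:
            return i
    return None
```

Running first_backward_step over the results logged for a failed trial would show whether training_iteration ever decreases, which would support or rule out this reading.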
Reproduction
Run HyperBand on CIFAR-10 with the following config:
sched = HyperBandScheduler(
    time_attr="training_iteration",
    metric="accuracy",
    mode="max",
    max_t=ray_config['epochs'])
The Trainable class is similar to https://github.com/ray-project/ray/blob/master/python/ray/tune/examples/mnist_pytorch_trainable.py, with the exact same save and restore methods.
Issue Analytics
- State:
- Created 4 years ago
- Comments:6 (2 by maintainers)
Top GitHub Comments
@semin-park currently investigating this!
BOHB is not working either, for the same reason.