
"auto" `{min,max}_resource` heuristic could yield different results on different processes.


Expected behavior

The min_resource of the SuccessiveHalvingPruner and the max_resource of the HyperbandPruner, when estimated with the "auto" argument, are expected to behave identically across processes under distributed optimization. As currently implemented, the estimates can differ between processes, which, although it might be rare in practice (see reproduction steps), could lead to unexpected behavior. See https://github.com/optuna/optuna/pull/1171#issuecomment-625604963.

Steps to reproduce

  1. Run a distributed optimization across multiple processes with either of the two pruners above, using the "auto" resource argument. Make sure trials don’t always run for the same number of steps (e.g., the number of epochs could be a hyperparameter).
  2. Note that the pruners in different processes can end up with different resources, since those are computed from the number of steps of the completed trial(s) each process observes.
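The divergence described above can be illustrated with a minimal sketch. This is plain Python, not Optuna's actual implementation; the heuristic shown (each process fixing its resource from the first completed trial it happens to observe) is an assumption made here purely for illustration:

```python
# Minimal sketch of how per-process "auto" resource estimation can diverge.
# Assumption (for illustration only): each process fixes its resource from
# the first completed trial it observes.

def auto_max_resource(completed_steps):
    """Return the step count of the first completed trial seen, or None."""
    return completed_steps[0] if completed_steps else None

# Two processes share one study, but trials complete in a different order
# from each process's point of view (the number of epochs is a
# hyperparameter here, so step counts differ between trials).
process_a_view = [10, 50]   # process A first sees a 10-epoch trial complete
process_b_view = [50, 10]   # process B first sees a 50-epoch trial complete

resource_a = auto_max_resource(process_a_view)
resource_b = auto_max_resource(process_b_view)
print(resource_a, resource_b)  # 10 50 -- the two pruners now disagree
```

Because pruning decisions depend on the resource budget, the two processes would then prune the same trial differently.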

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 9 (8 by maintainers)

Top GitHub Comments

1 reaction
hvy commented, May 19, 2020

Thanks for #1252, I took a look and had it merged.

I think that we should raise a warning when max_resource = “auto” and the step of the last intermediate value changes with each trial. What do you think?

That might be an option. At first I was a bit concerned, since having a varying number of steps between trials is a valid use case. For instance, sampling learning rates is quite common, and it’s reasonable for the number of steps to vary in the same way. However, combined with HB and “auto” it might be viable, if we can do it without any significant overhead. With a naive approach, we would probably iterate over all trials (similar to _try_initialization) in each call to prune; although that won’t add to the time complexity, since we’re doing the same thing in ASHA, I’m not sure it’s worth it just to raise a warning. I haven’t done any benchmarks, though.
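The warning check discussed in this exchange could be sketched as follows. This is a hypothetical helper, not Optuna's API; it assumes trials expose their intermediate values as a step-keyed dict, as Optuna's FrozenTrial does:

```python
import warnings


def warn_if_last_step_varies(trials):
    """Warn if the last intermediate-value step differs between trials.

    A varying final step would make a max_resource="auto" estimate
    unstable. `trials` is assumed to be an iterable of objects with an
    `intermediate_values` dict keyed by step.
    """
    last_steps = {
        max(t.intermediate_values)
        for t in trials
        if t.intermediate_values
    }
    if len(last_steps) > 1:
        warnings.warn(
            'max_resource="auto" may behave inconsistently because '
            "trials report different final steps: {}".format(
                sorted(last_steps)
            )
        )


# Usage with stand-in trial objects (hypothetical, for illustration):
from collections import namedtuple

Trial = namedtuple("Trial", "intermediate_values")
trials = [Trial({0: 0.1, 9: 0.5}), Trial({0: 0.2, 49: 0.7})]
warn_if_last_step_varies(trials)  # final steps 9 and 49 differ: warns
```

As the comment above notes, calling such a check on every prune would mean scanning all trials each time, which is the overhead being weighed against the value of the warning.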

1 reaction
hvy commented, May 11, 2020

Thanks for your input and the example scenario. Yes, actually, just documenting it might be a reasonable decision, as a proper fix isn’t trivial.


