
"auto" `{min,max}_resource` heuristic could yield different results on different processes.


Expected behavior

The min_resource of the SuccessiveHalvingPruner and the max_resource of the HyperbandPruner, when estimated with the "auto" argument, are expected to behave identically across processes under distributed optimization. As currently implemented, the estimates can differ between processes, which, although it might be rare in practice (see reproduction steps), could lead to unexpected behavior. See https://github.com/optuna/optuna/pull/1171#issuecomment-625604963.

Steps to reproduce

  1. Run a distributed optimization across multiple processes with either of the two pruners above, using the "auto" resource argument. Make sure trials don’t always run for the same number of steps (e.g., the number of epochs could be a hyperparameter).
  2. Note that the pruners in different processes can end up with different resources, since those are computed from the number of steps of the completed trial(s) each process observes.
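The divergence described above can be illustrated with a minimal sketch. This is plain Python, not Optuna's actual implementation; the heuristic shown (each process fixing its resource from the first completed trial it happens to observe) is an assumption made here purely for illustration:

```python
# Minimal sketch of how per-process "auto" resource estimation can diverge.
# Assumption (for illustration only): each process fixes its resource from
# the first completed trial it observes.

def auto_max_resource(completed_steps):
    """Return the step count of the first completed trial seen, or None."""
    return completed_steps[0] if completed_steps else None

# Two processes share one study, but trials complete in a different order
# from each process's point of view (the number of epochs is a
# hyperparameter here, so step counts differ between trials).
process_a_view = [10, 50]   # process A first sees a 10-epoch trial complete
process_b_view = [50, 10]   # process B first sees a 50-epoch trial complete

resource_a = auto_max_resource(process_a_view)
resource_b = auto_max_resource(process_b_view)
print(resource_a, resource_b)  # 10 50 -- the two pruners now disagree
```

Because pruning decisions depend on the resource budget, the two processes would then prune the same trial differently.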

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 1
  • Comments: 9 (8 by maintainers)

Top GitHub Comments

1 reaction
hvy commented, May 19, 2020

Thanks for #1252, I took a look and had it merged.

I think that we should raise a warning when max_resource = “auto” and the step of the last intermediate value changes with each trial. What do you think?

That might be an option. At first I was a bit concerned, since having a varying number of steps between trials is a valid use case. For instance, sampling learning rates is quite common, and it’s reasonable for the number of steps to vary in the same way. However, combined with HB and “auto” it might be viable, if we can do it without any significant overhead. With a naive approach, we would probably iterate over all trials (similar to _try_initialization) in each call to prune; although that won’t add to the time complexity, since we’re doing the same thing in ASHA, I’m not sure it’s worth it just to raise a warning. I haven’t done any benchmarks, though.
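The warning check discussed in this exchange could be sketched as follows. This is a hypothetical helper, not Optuna's API; it assumes trials expose their intermediate values as a step-keyed dict, as Optuna's FrozenTrial does:

```python
import warnings


def warn_if_last_step_varies(trials):
    """Warn if the last intermediate-value step differs between trials.

    A varying final step would make a max_resource="auto" estimate
    unstable. `trials` is assumed to be an iterable of objects with an
    `intermediate_values` dict keyed by step.
    """
    last_steps = {
        max(t.intermediate_values)
        for t in trials
        if t.intermediate_values
    }
    if len(last_steps) > 1:
        warnings.warn(
            'max_resource="auto" may behave inconsistently because '
            "trials report different final steps: {}".format(
                sorted(last_steps)
            )
        )


# Usage with stand-in trial objects (hypothetical, for illustration):
from collections import namedtuple

Trial = namedtuple("Trial", "intermediate_values")
trials = [Trial({0: 0.1, 9: 0.5}), Trial({0: 0.2, 49: 0.7})]
warn_if_last_step_varies(trials)  # final steps 9 and 49 differ: warns
```

As the comment above notes, calling such a check on every prune would mean scanning all trials each time, which is the overhead being weighed against the value of the warning.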

1 reaction
hvy commented, May 11, 2020

Thanks for your input and the example scenario. Yes, actually, just documenting it might be a reasonable decision, as a proper fix isn’t trivial.


