HPO does not terminate as expected with `max_experiment_duration`

Describe the issue:

How should the `experiment.config.max_experiment_duration = "1m"` option be understood? The HPO process keeps running even after the deadline has been reached.

I am just using the example configuration:


from nni.experiment import Experiment

SEARCH_SPACE = {
    "lr": {"_type": "qloguniform", "_value": [1e-4, 1, 1e-4]},
    "momentum": {"_type": "quniform", "_value": [0.1, 0.99, 0.01]},
    "weight_decay": {"_type": "qloguniform", "_value": [1e-5, 1e-2, 1e-5]},
    "batch_size": {"_type": "choice", "_value": [32, 64, 128, 256, 512]},
}

experiment = Experiment("local")
experiment.config.experiment_name = "Cifar Test"
experiment.config.trial_concurrency = 4
experiment.config.max_trial_number = 10
experiment.config.trial_gpu_number = 1
experiment.config.search_space = SEARCH_SPACE
experiment.config.max_experiment_duration = "1m"
experiment.config.trial_command = "python cifar.py"
experiment.config.tuner.name = "TPE"
experiment.config.tuner.class_args["optimize_mode"] = "maximize"
experiment.config.training_service.use_active_gpu = True
experiment.config.training_service.gpu_indices = [0, 1, 2, 3]

experiment.run(8080)
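
For reference, here is a minimal sketch of how a duration string such as "1m" maps to seconds. It assumes the "number + s|m|h|d" format that NNI documents for `max_experiment_duration`. The helper `duration_to_seconds` is hypothetical and is not part of NNI; it only illustrates that "1m" is a 60-second budget, which is shorter than a typical CIFAR training trial.

```python
import re

# Hypothetical helper, NOT part of NNI. It assumes the duration format is
# "<number><unit>" with unit in s, m, h, d (seconds, minutes, hours, days).
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def duration_to_seconds(text: str) -> float:
    """Convert a string such as "1m" or "0.5h" to a number of seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SECONDS[unit]


if __name__ == "__main__":
    for example in ("1m", "1h", "0.5h"):
        print(example, "=", duration_to_seconds(example), "seconds")
```

With this reading, "1m" gives the experiment one minute in total, so any trial that takes longer than that will still be in progress when the limit is reached.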

Environment:

  • NNI version: 2.9
  • Training service: local
  • Client OS: Ubuntu 20.04
  • Python version: 3.9
  • PyTorch/TensorFlow version: PyTorch 1.12
  • Is conda/virtualenv/venv used?: conda
  • Is running in Docker?: No

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 11 (6 by maintainers)

Top GitHub Comments

1 reaction
Lijiaoa commented, Oct 14, 2022

`no_more_trial` is not the same as the experiment status Running. It means the tuner could not dispatch any more trials because the experiment had reached its maximum duration of 1 minute. So I suggest you set max_duration to 1h and re-run the experiment.
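
If the goal is a hard stop once the experiment can no longer dispatch trials, one possible workaround is a small watchdog that polls the status and then stops the experiment. The sketch below is an assumption, not the maintainers' recommendation. The function `wait_then_stop` is hypothetical, and the status strings "RUNNING" and "NO_MORE_TRIAL" are assumed names for the states discussed above. It takes plain callables so it can be tried without NNI installed.

```python
import time
from typing import Callable, Iterable


def wait_then_stop(
    get_status: Callable[[], str],
    stop: Callable[[], None],
    timeout_s: float,
    poll_s: float = 1.0,
    active: Iterable[str] = ("RUNNING",),  # assumed status name
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the status leaves `active` or `timeout_s` elapses, then stop.

    Returns the last status observed before `stop()` was called.
    """
    active_set = set(active)
    deadline = time.monotonic() + timeout_s
    status = get_status()
    while status in active_set and time.monotonic() < deadline:
        sleep(poll_s)
        status = get_status()
    stop()  # note: this also ends any trials that are still in progress
    return status


if __name__ == "__main__":
    # Stand-in for a real experiment: reports RUNNING twice, then NO_MORE_TRIAL.
    fake_statuses = iter(["RUNNING", "RUNNING", "NO_MORE_TRIAL"])
    last = wait_then_stop(
        get_status=lambda: next(fake_statuses),
        stop=lambda: print("stop() called"),
        timeout_s=10,
        poll_s=0.01,
    )
    print("last status:", last)
```

With NNI itself, this would presumably be wired up by launching the experiment without blocking and passing the experiment's own status and stop methods as the two callables. Check the method names against the `Experiment` API of your NNI version before relying on this.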
