Problem with multirun in hydra

Hello 👋!

I am running into some issues when launching my experiments with Hydra's multirun mode (-m/--multirun).

First, I noticed that the configuration (OmegaConf) is not visible in the dashboard when multirun is used. To reproduce:

# main.py
import hydra
from clearml import Task
from omegaconf import DictConfig, OmegaConf

@hydra.main(config_name="config")
def my_app(cfg: DictConfig) -> None:
    print(OmegaConf.to_yaml(cfg))
    task = Task.init(project_name='hydra-clearml', task_name='hydra-clearml', reuse_last_task_id=False)

if __name__ == "__main__":
    my_app()

pip install hydra-core clearml
python main.py +test.test=1  # omegaconf logged
python main.py -m +test.test=1,2,3,4  # omegaconf not captured
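
For readers new to multirun: with -m, Hydra launches one job per combination of the comma-separated override values, so the second command above produces four runs. The following is a minimal pure-Python sketch of that expansion, not Hydra's own implementation. The name expand_sweep is hypothetical, and the sketch only handles plain comma-separated values (no ranges, quoting, or other sweep syntax).

```python
from itertools import product
from typing import List


def expand_sweep(overrides: List[str]) -> List[List[str]]:
    """Expand comma-separated override values into one override list per job.

    Hypothetical helper for illustration; it mimics the Cartesian-product
    behaviour of a basic sweep and ignores Hydra's richer sweep syntax.
    """
    keys: List[str] = []
    choices: List[List[str]] = []
    for item in overrides:
        # Split "key=v1,v2" into the key and its list of candidate values.
        key, _, values = item.partition("=")
        keys.append(key)
        choices.append(values.split(","))
    # One job per combination of values, keeping the override order.
    return [
        [f"{key}={value}" for key, value in zip(keys, combo)]
        for combo in product(*choices)
    ]


jobs = expand_sweep(["+test.test=1,2,3,4"])
for job_num, job_overrides in enumerate(jobs):
    print(job_num, job_overrides)
```

Each entry in jobs corresponds to one invocation of the app function, which is why the question of "one task or several" comes up at all.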

I wanted to force the configuration to be logged, so I added task.connect_configuration(OmegaConf.to_container(cfg, resolve=True)). I later realized this would not fully resolve the issue: all runs are stored under the same task, so only the latest config is kept. The plot below was taken from that single task, which contains 4 runs. It is not a comparison between tasks; it is one task holding all the runs!

[Screenshot from 2021-02-08 12-11-26]

Solution

I think the best solution (and the expected behaviour) would be for clearml to create one task per hydra run. This would also make it possible to compare performance across several hyperparameter settings.
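
To make the "one task per run" idea concrete, each job in the sweep needs something that tells its task apart from the others. The sketch below is an assumption, not part of ClearML or Hydra: task_name_for is a hypothetical helper that builds a distinct, readable name from a run's overrides. Such a name could then be passed as task_name to a Task.init() call made inside the Hydra app function.

```python
from typing import Sequence


def task_name_for(base: str, overrides: Sequence[str]) -> str:
    """Build a per-run task name such as 'hydra-clearml[test.test=2]'.

    Hypothetical helper: strips Hydra's leading '+' / '~' override prefixes
    and sorts the overrides so the same run always gets the same name.
    """
    cleaned = sorted(item.lstrip("+~") for item in overrides)
    return f"{base}[{','.join(cleaned)}]" if cleaned else base


# One name per job of the sweep `-m +test.test=1,2,3,4`.
names = [task_name_for("hydra-clearml", [f"+test.test={v}"]) for v in (1, 2, 3, 4)]
print(names)
```

Inside a real app, the list of overrides for the current job should be available from Hydra's runtime configuration (hydra.overrides.task); treat that lookup as something to verify against the Hydra version in use.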

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 8 (5 by maintainers)

Top GitHub Comments

antoinebrl commented, Feb 12, 2021 (1 reaction)

It is looking good! Runs are broken down into several tasks and the omegaconf generated by hydra is correctly captured. Well done!

bmartinn commented, Feb 10, 2021 (1 reaction)

I like the idea of the warning message which recommends creating and closing the task inside the app when multirun is activated.

There is no need to call close explicitly; only the Task.init() call matters (the task should be closed automatically when multirun is used).

Regarding the omegaconf issue, good news: I was able to reproduce the issue with multi-run. A fix is on its way 😉 I’ll update here soon.
