
Hyperparameter Optimization using multiple GPUs on a single Host

See original GitHub issue

I am currently trying to set up hyperparameter optimization using multiple GPUs on a single host. I followed and implemented this tutorial: https://keras-team.github.io/keras-tuner/tutorials/distributed-tuning/

The optimization works as expected, but I cannot distribute it across multiple GPUs on a single host using the following Bash file:

export KERASTUNER_TUNER_ID="chief"
export KERASTUNER_ORACLE_IP="127.0.0.1"
export KERASTUNER_ORACLE_PORT="8000"
python hp_test.py  &> chief.txt & 
export chief=$!

export KERASTUNER_TUNER_ID="tuner0"
python hp_test.py  &> t0.txt & 
export t0=$!

while kill -0 $chief && kill -0 $t0 
do
    r=$'\r'
    now="$(date +'%Y-%m-%d %H:%M:%S')"
    printf "${r}${now}: Alive)"
    sleep 1
done
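
(For reference, a minimal hp_test.py for this chief/worker setup could look roughly like the sketch below. This is an assumption following the linked distributed-tuning tutorial, not the actual script from the issue; the model and data are placeholders. The same file is launched once as the chief, which runs the oracle, and once per tuner, with the role picked up from the KERASTUNER_* environment variables set in the Bash file above.)

# Hypothetical hp_test.py -- a sketch, not the script from the issue.
# Older Keras Tuner versions use "import kerastuner as kt" instead.
import keras_tuner as kt
import tensorflow as tf
from tensorflow import keras


def build_model(hp):
    # Placeholder hypermodel: a small dense classifier for MNIST.
    model = keras.Sequential([
        keras.layers.Dense(hp.Int("units", 32, 256, step=32), activation="relu"),
        keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(
        optimizer=keras.optimizers.Adam(hp.Float("lr", 1e-4, 1e-2, sampling="log")),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


(x_train, y_train), _ = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

tuner = kt.RandomSearch(
    build_model,
    objective="val_accuracy",
    max_trials=10,
    directory="results",       # must be a path visible to the chief and all tuners
    project_name="hp_test",
)
tuner.search(x_train, y_train, validation_split=0.2, epochs=2)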

I have 3 questions:

  1. Is my Bash file wrong, and is that why I cannot start the optimization?
  2. In issue #329, it seems that it is not possible to distribute hyperparameter optimization across multiple GPUs on one system using Keras Tuner. Is this correct?
  3. If it is possible to distribute the optimization across multiple GPUs on one system, are there any more in-depth tutorials on how to set this up (see the sketch after this list)? As far as I can tell, you also need an oracle, but I couldn't find any documentation on how to set it up for multi-GPU distribution (which dependencies, execution…).
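
(On point 3, a hedged sketch: Keras Tuner's built-in tuners accept a distribution_strategy argument, so a single tuner process can, as far as I understand, train each trial data-parallel across all GPUs on one host without the chief/worker setup at all. build_model, x_train, and y_train below refer to the placeholder sketch above; names and values are assumptions, not a verified recipe.)

import keras_tuner as kt
import tensorflow as tf

# Each trial's model is replicated across all visible GPUs on this host.
# The oracle runs in-process, so no KERASTUNER_* environment variables
# or separate chief process are needed in this single-machine case.
tuner = kt.RandomSearch(
    build_model,                                   # hypermodel from the sketch above
    objective="val_accuracy",
    max_trials=10,
    distribution_strategy=tf.distribute.MirroredStrategy(),
    directory="results_mirrored",
    project_name="hp_test_multi_gpu",
)
tuner.search(x_train, y_train, validation_split=0.2, epochs=2)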

Thank you very much!

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

4 reactions
CarlPoirier commented, Feb 23, 2021

I need this as well!

0 reactions
BenK3nobi commented, Jul 18, 2022

I am stumbling over the same issue. I tried to get multi-GPU tuning on the same host running. Any news / advice so far?

Read more comments on GitHub >

Top Results From Across the Web

Automate Hyperparameter Tuning Across Multiple GPU - Run:AI
In this post, we will review how hyperparameters and hyperparameter tuning plays an important role in the design and training of machine learning...
Read more >
The importance of hyperparameter tuning for scaling deep ...
When moving from training on a single GPU to training on multiple GPUs, a good heuristic is to increase the mini-batch size by...
Read more >
Multi-GPU Hyperparameter Sweeps in Three Simple Steps
Hyperparameter sweeps are ways to automatically test different configurations of your model. They address a wide range of needs, including  ...
Read more >
PyTorch Lightning and Optuna: Multi-GPU hyperparameter ...
In this piece I would like to share my experience of using PyTorch Lightining and Optuna, a python library for automated hyperparameter tuning....
Read more >
Efficient Training on Multiple GPUs - Hugging Face
Switching from a single GPU to multiple requires some form of parallelism as the work needs to be distributed. There are several techniques...
Read more >
