Set up a GitHub Actions CI workflow for Horovod on CPU

🚀 Feature

The idea is to run Horovod tests on CPU via GitHub Actions. We can either 1) install Horovod using pip, or 2) compile Horovod from source, as is done for CircleCI (without NCCL support). The goal is to be able to run CPU distributed tests with the Horovod backend.
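For option 1, a minimal sketch of how a CI step might prepare a CPU-only pip install is shown here. It assumes the build switches `HOROVOD_WITH_PYTORCH`, `HOROVOD_WITHOUT_TENSORFLOW` and `HOROVOD_WITHOUT_MXNET` from the Horovod install docs are appropriate for this project. The helper names `cpu_horovod_build_env` and `pip_install_command` are hypothetical and only illustrate the idea; the sketch builds the command and environment but does not run the install.

```python
import os
import sys


def cpu_horovod_build_env(base_env=None):
    """Return a copy of the environment prepared for a CPU-only Horovod build."""
    env = dict(os.environ if base_env is None else base_env)
    # Leaving HOROVOD_GPU_OPERATIONS unset keeps NCCL out of the build.
    env.pop("HOROVOD_GPU_OPERATIONS", None)
    # Assumption: only the PyTorch bindings are needed for these tests.
    env["HOROVOD_WITH_PYTORCH"] = "1"
    env["HOROVOD_WITHOUT_TENSORFLOW"] = "1"
    env["HOROVOD_WITHOUT_MXNET"] = "1"
    return env


def pip_install_command(package="horovod"):
    """Build the pip command for the current interpreter (not executed here)."""
    return [sys.executable, "-m", "pip", "install", "--no-cache-dir", package]
```

A workflow step could then call `subprocess.run(pip_install_command(), env=cpu_horovod_build_env(), check=True)` after PyTorch has been installed, since the Horovod build needs to find it.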

TODO:

A quick way to experiment with GitHub Actions is to use a separate repository such as https://github.com/vfdev-5/github-actions-playground

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7 (4 by maintainers)

Top GitHub Comments

2 reactions
ramesht007 commented, Oct 2, 2020

Hey @sdesrozis, I am working on this. I will open a PR in a few days. 🙂

1 reaction
vfdev-5 commented, Oct 7, 2020

Our goal is to run distributed tests on CPU, and the tests referred to are in CircleCI, which uses Docker. So do I have to use Docker for the same?

@ramesht007 in CircleCI we use a Docker environment to easily get a PyTorch distribution without installing conda etc., as is done in GitHub Actions. For GitHub Actions, let's follow the same setup as https://github.com/pytorch/ignite/blob/master/.github/workflows/unittests.yml#L29, i.e. the Setup Miniconda and Install dependencies steps. Let's use Python 3.7 and the latest PyTorch version.
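Once Horovod and PyTorch are installed in the workflow, a CPU smoke test along the following lines could confirm that the backend works before the real distributed tests run. This is only a sketch, not the test used in the repository: the function name `allreduce_smoke_test` is hypothetical, and the script skips quietly when either package is missing.

```python
import importlib.util

# Detect the optional dependencies without importing them.
HAS_HOROVOD = (
    importlib.util.find_spec("horovod") is not None
    and importlib.util.find_spec("torch") is not None
)


def allreduce_smoke_test():
    """Sum each worker's rank across all workers and check the result."""
    import torch
    import horovod.torch as hvd

    hvd.init()
    rank, size = hvd.rank(), hvd.size()
    local = torch.tensor([float(rank)])  # CPU tensor, no GPU required
    total = hvd.allreduce(local, op=hvd.Sum)
    expected = size * (size - 1) / 2  # 0 + 1 + ... + (size - 1)
    assert total.item() == expected, (total.item(), expected)
    return rank, size, total.item()


def main():
    if not HAS_HOROVOD:
        # Nothing to check on machines without Horovod and PyTorch.
        return None
    return allreduce_smoke_test()


if __name__ == "__main__":
    print(main())
```

Saved as, say, `check_hvd.py` (a hypothetical file name), it can be launched with `horovodrun -np 2 python check_hvd.py` to exercise two CPU processes, or with plain `python check_hvd.py` for a single-process check.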


Top Results From Across the Web

  • Add GitHub Workflow CI · horovod/horovod@0974e3b · GitHub
  • horovod/ci.yaml at master - GitHub
  • horovod/Dockerfile.test.cpu at master - GitHub
  • horovod/install.rst at master - GitHub
  • intel/intel-horovod: Distributed training framework for ... - GitHub
    To run on CPUs: $ pip install horovod. To run on GPUs with NCCL: $ HOROVOD_GPU_OPERATIONS=NCCL pip install horovod. For more details on...

All horovod/horovod entries carry the same repository description: Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
