Migrate tensor parallelism code to use OSLO
Is your feature request related to a problem? Please describe.
It would be good to remove the Megatron tensor parallelism code from NeoX. OSLO currently supports this and has a slightly nicer interface.
Describe the solution you’d like
Steps:
- Rewrite all current modules as plain PyTorch implementations, removing the `mpu` dependency from internal code as much as possible. Anything that is currently an `mpu.[Column|Row]ParallelLinear` or `mpu.VocabParallelEmbedding` should be replaced with its plain PyTorch equivalent (`nn.Linear` or `nn.Embedding`, respectively).
- Write a mapping for NeoX modules, which OSLO uses to handle parallelization.
- Ensure backwards compatibility
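
To make the second step more concrete, here is a minimal, framework-free sketch of what such a mapping could express. It is not OSLO's actual mapping format: the `PARALLEL_STYLE` table, the helper names `style_for` and `shard_shape`, and the module-name suffixes are all assumptions chosen for illustration. The only thing it relies on is the usual convention that a column-parallel linear layer splits `out_features`, a row-parallel one splits `in_features`, and a vocab-parallel embedding splits the vocabulary dimension.

```python
# Hypothetical mapping from a NeoX-style module-name suffix to the way its
# weight would be partitioned. The suffixes are illustrative assumptions and
# this is NOT the format OSLO itself expects.
PARALLEL_STYLE = {
    "attention.query_key_value": "column",  # was a ColumnParallelLinear
    "attention.dense": "row",               # was a RowParallelLinear
    "mlp.dense_h_to_4h": "column",
    "mlp.dense_4h_to_h": "row",
    "word_embeddings": "vocab",             # was a VocabParallelEmbedding
}


def style_for(module_name):
    """Return the partitioning style for a module name, or None if unsplit."""
    for suffix, style in PARALLEL_STYLE.items():
        if module_name.endswith(suffix):
            return style
    return None


def shard_shape(style, dim0, dim1, world_size):
    """Per-rank shape of a (dim0, dim1) weight under the given style.

    nn.Linear stores its weight as (out_features, in_features) and
    nn.Embedding as (num_embeddings, embedding_dim), so "column" and
    "vocab" split dim 0 while "row" splits dim 1.
    """
    if style in ("column", "vocab"):
        split, other = dim0, dim1
    elif style == "row":
        split, other = dim1, dim0
    else:
        return (dim0, dim1)  # module is replicated, not partitioned
    if split % world_size != 0:
        raise ValueError("partitioned dimension must divide evenly across ranks")
    shard = split // world_size
    return (shard, other) if style != "row" else (other, shard)


if __name__ == "__main__":
    name = "layers.0.attention.query_key_value"
    print(name, style_for(name), shard_shape(style_for(name), 3 * 1024, 1024, 4))
```

With a table like this, the model code itself can stay plain `nn.Linear` / `nn.Embedding`, and the decision about how each weight is split lives in one place outside the modules.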
Issue Analytics
- State:
- Created 2 years ago
- Comments:6 (6 by maintainers)
Top GitHub Comments
I will actively support this work.
@sdtblck Did you check my branch?