[feature request] a tool to clone existing models to make new models with small changes
Feature request
So we have great templates for creating a new model.
Can you think of a way to create full clones of existing models?
In practice, for BigScience we will need to create something like GPTMeg, which is 99.9% identical to GPT2 apart from 2-3 tiny changes. Then we will need another GPT2 variant that replaces positional embeddings with ALiBi. And there will be more variants.
Using templates would be quite expensive when almost everything is identical.
So ideally a user will do:
transformers-clone-model GPT2 GPTMeg
and voila, it'd replicate the model's files, tests and docs.
If all source files could be easily identified, this could perhaps be done in a few Perl one-liners. Here is a very rough outline:
- find the pertinent source files: grep -Irl GPT2 .
- rename files/dirs while copying: s/gpt2/gpt_meg/
- rename internals: s/GPT2/GPTMeg/g
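The outline above could also be sketched in Python rather than Perl. This is only a rough illustration under stated assumptions: `clone_model` is a hypothetical helper, not an existing transformers tool, and it applies every (old, new) pair to both paths and file contents, which goes slightly beyond the three steps as written. Files that mention the old name but whose own path would not change are reported rather than modified.

```python
from pathlib import Path


def clone_model(repo_root, renames):
    """Hypothetical helper: copy files that mention an old name to renamed paths.

    `renames` is a list of (old, new) pairs, e.g. [("GPT2", "GPTMeg")].
    Returns (created, shared): relative paths of the new copies, and of files
    that mention an old name but keep the same path (left untouched).
    """
    root = Path(repo_root)
    created, shared = [], []
    # sorted() consumes the whole listing first, so files written below
    # are never revisited during the walk.
    for src in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = src.relative_to(root).as_posix()
        if any(part.startswith(".") for part in rel.split("/")):
            continue  # skip hidden dirs such as .git (an assumption)
        try:
            text = src.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # skip binary files, roughly like grep -I
        if not any(old in text for old, _ in renames):
            continue  # file does not mention the model at all
        new_rel = rel
        for old, new in renames:
            new_rel = new_rel.replace(old, new)  # rename files/dirs
            text = text.replace(old, new)        # rename internals
        if new_rel == rel:
            shared.append(rel)  # same path: needs a manual edit instead
            continue
        target = root / new_rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")
        created.append(new_rel)
    return created, shared
```

It would be called as `created, shared = clone_model(".", [("GPT2", "GPTMeg"), ("gpt2", "gpt_meg")])`. The originals are never modified; anything listed in `shared` still has to be updated by hand.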
The hard-to-automate part is the index files, as there is only one of each.
I think I can work it out, but I'm afraid that the end result would be a set of Perl one-liners only Stas will know what to do with. So perhaps long term this is not a good solution.
Here is the Issue where we need to implement this: https://github.com/bigscience-workshop/Megatron-DeepSpeed/issues/138 and 2 more will be coming soon.
That's an interesting feature request, would be very useful indeed! Could provide a better starting point than the templates in many situations.
I can work on this a bit next week once I have re-enabled the doc styler. I don't promise to have something fully finished before I go on vacation (first week of January) however.