[feature request] a tool to clone existing models to make new models with small changes


🚀 Feature request

So we have great templates for creating a new model.

Can you think of a way to create full clones of existing models?

In practice, for BigScience we will have to create something like GPTMeg, which is 99.9% identical to GPT2 apart from 2-3 tiny changes. And then we will need another GPT2 variant that replaces Positional Embeddings with ALiBi. And there will be more variants.

Using templates would be quite expensive when almost everything is identical.

So ideally a user will do:

transformers-clone-model GPT2 GPTMeg

and voilà, it’d replicate the model’s files, tests and docs.

If all source files could be easily identified, this could perhaps be done in a few Perl one-liners. Here is a very rough outline:

  1. find the pertinent source files: grep -Irl GPT2 .
  2. rename files/dirs while copying: s/gpt2/gpt_meg/
  3. rename internals: s/GPT2/GPTMeg/g
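
The three steps above could also be sketched in Python rather than Perl. This is only a rough illustration, not an existing transformers tool: the function name clone_model and its renames mapping are hypothetical, and it assumes every old spelling (GPT2, gpt2) is listed explicitly. Files whose path does not change under the renames are reported instead of written, since those are the shared files that exist only once.

```python
import re
from pathlib import Path


def clone_model(root, renames):
    """Hypothetical helper: clone model files under `root` by textual renaming.

    `renames` maps each old spelling to its new one,
    e.g. {"GPT2": "GPTMeg", "gpt2": "gpt_meg"}.
    Returns (copied, shared) as lists of POSIX-style paths relative to `root`.
    """
    root = Path(root)
    # One case-sensitive pattern that matches any of the old spellings.
    pattern = re.compile("|".join(re.escape(old) for old in renames))

    def substitute(text):
        return pattern.sub(lambda m: renames[m.group(0)], text)

    copied, shared = [], []
    # sorted() materializes the listing first, so files written below
    # are not revisited during the walk.
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            continue  # skip binary files, similar to grep -I
        # Step 1: keep only files whose contents mention the model.
        if not pattern.search(text):
            continue
        rel = path.relative_to(root).as_posix()
        # Step 2: rename files/dirs while copying.
        new_rel = substitute(rel)
        if new_rel == rel:
            # The path has no model name in it, so this is a shared file.
            # Report it for manual editing instead of overwriting it.
            shared.append(rel)
            continue
        target = root / new_rel
        target.parent.mkdir(parents=True, exist_ok=True)
        # Step 3: rename internals.
        target.write_text(substitute(text), encoding="utf-8")
        copied.append(new_rel)
    return copied, shared
```

Calling clone_model on a checkout with {"GPT2": "GPTMeg", "gpt2": "gpt_meg"} would leave the GPT2 files in place, write renamed copies next to them, and return the shared files that still need a hand-written entry for the new model.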

The hard-to-automate part is the index files, as there is only one of each.

I think I can work it out, but I’m afraid that the end result would be a set of Perl one-liners only Stas will know what to do with. So perhaps long term this is not a good solution.

Here is the Issue where we need to implement this: https://github.com/bigscience-workshop/Megatron-DeepSpeed/issues/138 and 2 more will be coming soon.

@LysandreJik, @patrickvonplaten, @sgugger

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 10 (10 by maintainers)

Top GitHub Comments

2 reactions
LysandreJik commented, Oct 18, 2021

That’s an interesting feature request, would be very useful indeed! Could provide a better starting point than the templates in many situations.

1 reaction
sgugger commented, Dec 23, 2021

I can work on this a bit next week once I have re-enabled the doc styler. I don’t promise to have something fully finished before I go on vacation (first week of January) however.
