
Pass the model initializer into `Trainer.__init__`


I think it could be helpful to add a `model_initializer: Optional[Callable[[torch.nn.Module], None]]` argument to the trainer.

If this argument is provided, the trainer would call `model.apply(model_initializer)` after the random seed is set, after deterministic mode is configured, and after `Event.INIT` fires (i.e. after surgery occurs), but before checkpoints are loaded. The advantages of initializing the model here (rather than relying on the user to initialize the model before passing it into the trainer) are that:

  1. the initialization would be deterministic w.r.t. the random seed (so the user no longer needs to call `composer.utils.reproducibility.set_random_seed()` or `composer.utils.reproducibility.configure_deterministic_mode()` before creating the model and constructing the trainer)
  2. model layers replaced via surgery would be initialized.

This would not affect checkpoints, since checkpoint weights would override any initialized weights.
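
A minimal sketch of the proposal, assuming the argument name above; the trainer internals shown here are hypothetical and heavily simplified:

```python
from typing import Callable, Optional

import torch

def init_linear(module: torch.nn.Module) -> None:
    # Example initializer suitable for model.apply(): re-initializes
    # Linear layers in place and leaves every other module untouched.
    if isinstance(module, torch.nn.Linear):
        torch.nn.init.kaiming_normal_(module.weight)
        if module.bias is not None:
            torch.nn.init.zeros_(module.bias)

def trainer_init_sketch(
    model: torch.nn.Module,
    model_initializer: Optional[Callable[[torch.nn.Module], None]] = None,
) -> torch.nn.Module:
    # 1. set the random seed and configure deterministic mode (not shown)
    # 2. fire Event.INIT, where surgery may replace layers (not shown)
    # 3. apply the initializer, so any layers replaced by surgery are
    #    initialized deterministically w.r.t. the seed
    if model_initializer is not None:
        model.apply(model_initializer)
    # 4. load checkpoint weights if given, overriding step 3 (not shown)
    return model
```

A user would then write something like `Trainer(model=model, model_initializer=init_linear, ...)` (a hypothetical call under this proposal) and never touch the seeding utilities directly.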

Thoughts?

cc: @jbloxham @ajaysaini725 @anisehsani @hanlint @A-Jacobson

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
A-Jacobson commented, Feb 16, 2022

> Alternatively, we could make the recommended way be for the user to call `composer.utils.reproducibility` themselves, and then remove these arguments from the trainer's `__init__`. However, I think this would make it slightly less intuitive, as the user would then need another import and would have to know where to call these functions.

Forgot to address this. I'd like to keep it on the trainer's `__init__` if possible too (I don't like the additional import), but only if it actually does what it says it does.
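
For contrast, a sketch of the manual route being weighed here, using the reproducibility function names as they appear in the issue text (the exact `composer.utils.reproducibility` API may differ by version):

```python
import torch
from composer.utils import reproducibility

# Seed and deterministic mode must be configured *before* the model is
# constructed, or its randomly initialized weights are not reproducible.
reproducibility.set_random_seed(42)  # name as written in the issue text
reproducibility.configure_deterministic_mode()

model = torch.nn.Linear(10, 2)  # initial weights now depend only on the seed
# ...then construct the Trainer with this already-initialized model.
```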

0 reactions
mvpatel2000 commented, Nov 3, 2022

Closing because we don't plan on adding this.


Top Results From Across the Web

Understanding Trainer.initialize: batch size x number of inputs?
When you call `trainer.initialize(shape)`, the shape you pass into it is the input shape that your model accepts.

Initialize a model with 100 billion parameters in no time and ...
This way, your model can run for inference even if it doesn't fit on one of the GPUs or the CPU RAM! This...

Trainer — PyTorch Lightning 1.8.5.post0 documentation
This might be useful if you want to collect new metrics from a model right at its initialization or after it has already...

Pass 'this' object to an initialization list - c++ - Stack Overflow
If you need an instance object instead of a pointer, try: `trainer::trainer() : myPokemon(*this) {}`. Be careful if Charizard tries to call...

Layer weight initializers - Keras
Initializers define the way to set the initial random weights of Keras layers. The keyword arguments used for passing initializers to layers depends...
