Pass the model initializer into `Trainer.__init__`
I think it could be helpful to add a `model_initializer: Optional[Callable[[torch.nn.Module], None]]` argument to the trainer.
If this argument is provided, the trainer would call `model.apply(model_initializer)` after the random seed is set, after deterministic mode is configured, and after `Event.INIT` fires (i.e. after surgery occurs), but before checkpoints are loaded.
The advantages of initializing the model here (rather than relying on the user to initialize the model before passing it into the trainer) are that:
- the initialization would be deterministic w.r.t. the random seed (so the user no longer needs to call `composer.utils.reproducibility.set_random_seed()` or `composer.utils.reproducibility.configure_deterministic_mode()` before creating the model and constructing the trainer)
- model layers replaced via surgery would be initialized.
This would not affect checkpoints, since checkpoint weights would override any initialized weights.
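To make the proposed ordering concrete, here is a minimal sketch in plain PyTorch. It is not the trainer's implementation: `init_after_seed`, `init_weights`, and `make_model` are hypothetical names, and `torch.manual_seed` stands in for the trainer's own seeding step. Surgery and checkpoint loading are only marked by comments.

```python
from typing import Callable, Optional

import torch


def init_weights(module: torch.nn.Module) -> None:
    """Hypothetical initializer: re-initialize only the Linear layers."""
    if isinstance(module, torch.nn.Linear):
        torch.nn.init.xavier_uniform_(module.weight)
        if module.bias is not None:
            torch.nn.init.zeros_(module.bias)


def init_after_seed(
    model: torch.nn.Module,
    seed: int,
    model_initializer: Optional[Callable[[torch.nn.Module], None]] = None,
) -> torch.nn.Module:
    """Hypothetical stand-in for the ordering proposed for Trainer.__init__."""
    # 1. Seed the RNG (assumption: stands in for the trainer's seeding step).
    torch.manual_seed(seed)
    # 2. Event.INIT / surgery would run here, possibly replacing layers.
    # 3. Apply the initializer to every submodule, including replaced ones.
    if model_initializer is not None:
        model.apply(model_initializer)
    # 4. Checkpoint loading would follow and override these weights.
    return model


def make_model() -> torch.nn.Module:
    # Built before any seeding, so its default weights differ between calls.
    return torch.nn.Sequential(
        torch.nn.Linear(4, 8), torch.nn.ReLU(), torch.nn.Linear(8, 2)
    )


model_a = init_after_seed(make_model(), seed=42, model_initializer=init_weights)
model_b = init_after_seed(make_model(), seed=42, model_initializer=init_weights)
```

Under these assumptions, `model_a` and `model_b` end up with identical parameters even though both were constructed before the seed was set, which is the determinism benefit described above.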
Thoughts?
cc: @jbloxham @ajaysaini725 @anisehsani @hanlint @A-Jacobson
Issue Analytics
- State:
- Created 2 years ago
- Comments:6 (6 by maintainers)
forgot to address this. I’d like to keep it on the trainer init if possible too (I don’t like the additional import) but only if it actually does what it says it does.
Closing because we don’t plan on adding this