Pass the model initializer into `Trainer.init`

I think it could be helpful to add a model_initializer: Optional[Callable[[torch.nn.Module], None]] argument to the trainer.

If this argument is provided, the trainer would call model.apply(model_initializer) after the random seed is set, after deterministic mode is configured, and after Event.INIT fires (i.e. after surgery occurs), but before checkpoints are loaded. The advantages of initializing the model here (rather than relying on the user to initialize the model before passing it into the trainer) are that:

initialization would be deterministic with respect to the random seed (so the user no longer needs to call composer.utils.reproducibility.set_random_seed() or composer.utils.reproducibility.configure_deterministic_mode() before creating the model and constructing the trainer);
model layers replaced via surgery would also be initialized.

This would not affect checkpoints, since checkpoint weights would override any initialized weights.
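
To make the proposed ordering concrete, here is a minimal sketch in plain PyTorch. It is not Composer's implementation: `prepare_model` and `init_weights` are hypothetical names made up for illustration, and surgery and checkpoint loading appear only as comments marking where they would run.

```python
from typing import Callable, Optional

import torch


def init_weights(module: torch.nn.Module) -> None:
    """Hypothetical initializer: Xavier-uniform weights and zero biases for Linear layers."""
    if isinstance(module, torch.nn.Linear):
        torch.nn.init.xavier_uniform_(module.weight)
        if module.bias is not None:
            torch.nn.init.zeros_(module.bias)


def prepare_model(
    model: torch.nn.Module,
    seed: int,
    model_initializer: Optional[Callable[[torch.nn.Module], None]] = None,
) -> torch.nn.Module:
    """Hypothetical stand-in for the ordering described in the proposal."""
    # 1. Seed the RNG before any initialization happens.
    torch.manual_seed(seed)
    # 2. (Assumption: surgery / Event.INIT would run here in the real trainer.)
    # 3. Apply the initializer to every submodule, including replaced layers.
    if model_initializer is not None:
        model.apply(model_initializer)
    # 4. (Assumption: checkpoint loading would follow and override these weights.)
    return model


def make_model() -> torch.nn.Sequential:
    return torch.nn.Sequential(
        torch.nn.Linear(4, 8),
        torch.nn.ReLU(),
        torch.nn.Linear(8, 2),
    )


# Two independently constructed models end up with identical weights,
# because initialization happens after the seed is set.
model_a = prepare_model(make_model(), seed=42, model_initializer=init_weights)
model_b = prepare_model(make_model(), seed=42, model_initializer=init_weights)
```

Since the seed is set inside `prepare_model`, the caller does not need to seed anything before constructing the model, which is the first advantage listed above.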

Thoughts?

cc: @jbloxham @ajaysaini725 @anisehsani @hanlint @A-Jacobson

Issue Analytics

Created 2 years ago
Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction

A-Jacobson commented, Feb 16, 2022

Alternatively, we can have the recommended way be to have the user call composer.utils.reproducibility, and then remove these arguments from the trainer’s __init__. However, I think this would make it slightly less intuitive, as now a user would need another import and know where to call these functions.

forgot to address this. I’d like to keep it on the trainer init if possible too (I don’t like the additional import) but only if it actually does what it says it does.

0 reactions

mvpatel2000 commented, Nov 3, 2022

Closing because we don’t plan on adding this
