Relaxing `PreTrainedModel` requirement in `_save`
🚀 Feature request
It’s great to see that `Trainer` is becoming more flexible. Each function seems to be more self-contained now, which makes inheritance easier. I’ve experimented with many custom models. For instance,
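import torch.nn as nn
from transformers import AutoModel

class Model(nn.Module):
    def __init__(self, ..):
        super().__init__()  # required before registering submodules
        self.encoder = AutoModel.from_pretrained(..)
        self.custom_modules = ..

    def forward(self, **kwargs):
        output = self.encoder(**kwargs)
        # some custom operations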
Many users need to create custom models when a simple `SequenceClassification` head is not enough. In all such cases, I have to override the `_save` method because of this line, which explicitly restricts `Trainer` to models that inherit from `PreTrainedModel`. It would be good to relax this requirement and instead give a warning when the model is not a `PreTrainedModel`.
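A minimal sketch of the warn-instead-of-fail behaviour requested above. This is not the actual `Trainer._save` implementation: `save_model`, `CustomModel`, and the file name `model_state.pkl` are hypothetical, and `pickle` stands in for `torch.save` only so the example runs without extra dependencies.

```python
import os
import pickle
import tempfile
import warnings


def save_model(model, output_dir):
    """Hypothetical helper: warn, rather than raise, when the model
    does not provide `save_pretrained`."""
    os.makedirs(output_dir, exist_ok=True)
    if callable(getattr(model, "save_pretrained", None)):
        # PreTrainedModel-style path: the model knows how to save itself.
        model.save_pretrained(output_dir)
        return "save_pretrained"
    warnings.warn(
        "Model does not provide save_pretrained; saving only its state_dict.",
        UserWarning,
    )
    # With a real nn.Module this would be torch.save(model.state_dict(), path);
    # pickle is an assumption made to keep the sketch dependency-free.
    path = os.path.join(output_dir, "model_state.pkl")
    with open(path, "wb") as f:
        pickle.dump(model.state_dict(), f)
    return "state_dict"


class CustomModel:
    """Stand-in for a custom nn.Module that wraps an encoder."""

    def state_dict(self):
        return {"weight": [0.1, 0.2]}


with tempfile.TemporaryDirectory() as tmp:
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        mode = save_model(CustomModel(), tmp)
    saved_files = sorted(os.listdir(tmp))
```

The check is duck-typed on `save_pretrained` rather than on the class, so a custom wrapper can opt in to the richer path simply by defining that method.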
Your contribution
I’ll open a PR if I get approval.
Issue Analytics
- State:
- Created 3 years ago
- Comments:7 (7 by maintainers)
After some internal discussion with @julien-c, we will lower the requirement from `PreTrainedModel` to some lower abstract class/protocol so the user knows exactly what they have to implement for their model to work seamlessly with `Trainer`. I will work on this at the end of this week or the beginning of next.

Sounds good. I’ll look forward to that part then.
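As a rough illustration of the "lower abstract class/protocol" idea, here is a sketch using `typing.Protocol`. The thread does not say which methods the eventual interface would require, so the name `SavableModel` and its two methods are assumptions, not the maintainers' design.

```python
from typing import Any, Dict, Protocol, runtime_checkable


@runtime_checkable
class SavableModel(Protocol):
    """Hypothetical minimal interface for Trainer-style saving."""

    def state_dict(self) -> Dict[str, Any]: ...

    def save_pretrained(self, save_directory: str) -> None: ...


class WithSave:
    """Custom model that implements both assumed methods."""

    def state_dict(self):
        return {}

    def save_pretrained(self, save_directory):
        pass


class WithoutSave:
    """Custom model that only exposes its weights."""

    def state_dict(self):
        return {}


# runtime_checkable protocols only check that the methods exist,
# not their signatures or return types.
has_protocol = isinstance(WithSave(), SavableModel)
lacks_protocol = isinstance(WithoutSave(), SavableModel)
```

A structural protocol like this would let a plain `nn.Module` subclass satisfy the requirement without inheriting from any library base class.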