
Load pretrained model except the head layer for a specific downstream task

See original GitHub issue

🚀 Feature request

It would be nice to have a flag for the `from_pretrained` method that indicates whether or not to load the last layer. This feature is needed for transfer learning.

Motivation

I have trained a model on a dataset for a specific downstream task. Now I need to train a second model on a similar dataset that has a different set of labels. The first model has already learned useful features from its dataset, so the new model should not have to start from scratch. However, when I try to load the first model with the `from_pretrained` method, it raises a size-mismatch error because the shape of the last layer depends on the number of labels. With a flag to load (or skip) the last layer, I could initialize that layer randomly and continue training via transfer learning.
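The requested behavior can be sketched framework-independently: copy every pretrained weight whose name and shape match the new model, and leave the head (or anything with a mismatched shape) at its fresh random initialization. The sketch below is a hypothetical illustration, not the Transformers implementation; for brevity it models state dicts as `{name: shape}` mappings, where real code would carry weight tensors. The `classifier` prefix is an assumption about how the head is named.

```python
def load_except_head(pretrained, fresh, head_prefix="classifier"):
    """Merge pretrained weights into a freshly initialized model's
    state dict, skipping the head and any shape-mismatched entries.

    Both arguments are toy state dicts mapping parameter names to
    shapes; with a real framework the values would be tensors and the
    assignment below would copy the tensor.
    """
    merged = dict(fresh)  # start from the freshly initialized model
    skipped = []
    for name, shape in pretrained.items():
        if name.startswith(head_prefix) or fresh.get(name) != shape:
            skipped.append(name)  # head or shape mismatch: keep fresh init
        else:
            merged[name] = shape  # stands in for copying the weight tensor
    return merged, skipped

# Old model trained with 2 labels; new task has 5 labels.
pretrained = {"encoder.weight": (768, 768), "classifier.weight": (2, 768)}
fresh      = {"encoder.weight": (768, 768), "classifier.weight": (5, 768)}

merged, skipped = load_except_head(pretrained, fresh)
# skipped == ["classifier.weight"]; the encoder weights carry over,
# while the 5-label head keeps its random initialization.
```

For what it's worth, later versions of Hugging Face Transformers added an `ignore_mismatched_sizes=True` argument to `from_pretrained` that, combined with `num_labels`, covers this use case; in plain PyTorch the same effect comes from filtering the state dict and calling `load_state_dict(..., strict=False)`.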

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
LysandreJik commented, Apr 7, 2021

@vimarshc this issue has not been addressed elsewhere. Feel free to draft a proposal in an issue/PR so that we can take a look and discuss! Thank you!

1 reaction
vimarshc commented, Apr 6, 2021

Hi @LysandreJik, is this issue being addressed elsewhere? If not, I would like to work on it.

Read more comments on GitHub >

Top Results From Across the Web

How do I change the classification head of a model?
You have to remove the last part (the classification head) of the model. (BERT base uncased + Classification) = new Model...
Read more >
Adding Custom Layers on Top of a Hugging Face Model
Some models on Hugging Face are trained on downstream tasks like question-answering or text classification and contain knowledge about the data they were ......
Read more >
Transfer Learning for Computer Vision Tutorial - PyTorch
We will use torchvision and torch.utils.data packages for loading the data. ... Load a pretrained model and reset final fully connected layer.
Read more >
Pretrained transformer framework on pediatric claims data for ...
Experimental results on two downstream tasks demonstrated the ... pre-training framework outperformed tailored task-specific models, ...
Read more >
How much does pre-trained information help? Partially re ...
of pre-trained knowledge for each layer on downstream tasks. ... model is to load it with pre-trained weights and then train this model...
Read more >
