
Load pretrained model except the head layer for a specific downstream task

See original GitHub issue

🚀 Feature request

It would be nice to have a flag for the `from_pretrained` method that indicates whether or not to load the last layer. This feature is needed for transfer learning.

Motivation

I have trained a model on a dataset for a specific downstream task. Now I need to train a second model on a similar dataset that has a different set of labels. The first model has already learned useful features from its dataset, so the new model should not have to start from scratch. However, when I try to load the first model with the `from_pretrained` method, it raises a size-mismatch error because the shape of the last layer depends on the number of labels. With a flag to load (or skip) the last layer, I could initialize that layer randomly and continue training via transfer learning.
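The requested behavior can be sketched framework-independently: copy every pretrained weight whose name and shape match the new model, and leave the head (or anything with a mismatched shape) at its fresh random initialization. The sketch below is a hypothetical illustration, not the Transformers implementation; for brevity it models state dicts as `{name: shape}` mappings, where real code would carry weight tensors. The `classifier` prefix is an assumption about how the head is named.

```python
def load_except_head(pretrained, fresh, head_prefix="classifier"):
    """Merge pretrained weights into a freshly initialized model's
    state dict, skipping the head and any shape-mismatched entries.

    Both arguments are toy state dicts mapping parameter names to
    shapes; with a real framework the values would be tensors and the
    assignment below would copy the tensor.
    """
    merged = dict(fresh)  # start from the freshly initialized model
    skipped = []
    for name, shape in pretrained.items():
        if name.startswith(head_prefix) or fresh.get(name) != shape:
            skipped.append(name)  # head or shape mismatch: keep fresh init
        else:
            merged[name] = shape  # stands in for copying the weight tensor
    return merged, skipped

# Old model trained with 2 labels; new task has 5 labels.
pretrained = {"encoder.weight": (768, 768), "classifier.weight": (2, 768)}
fresh      = {"encoder.weight": (768, 768), "classifier.weight": (5, 768)}

merged, skipped = load_except_head(pretrained, fresh)
# skipped == ["classifier.weight"]; the encoder weights carry over,
# while the 5-label head keeps its random initialization.
```

For what it's worth, later versions of Hugging Face Transformers added an `ignore_mismatched_sizes=True` argument to `from_pretrained` that, combined with `num_labels`, covers this use case; in plain PyTorch the same effect comes from filtering the state dict and calling `load_state_dict(..., strict=False)`.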

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
LysandreJik commented, Apr 7, 2021

@vimarshc this issue has not been addressed elsewhere. Feel free to draft a proposal in an issue/PR so that we can take a look and discuss! Thank you!

1 reaction
vimarshc commented, Apr 6, 2021

Hi @LysandreJik, is this issue being addressed elsewhere? If not, I would like to work on it.

Read more comments on GitHub >

Top Results From Across the Web

How do I change the classification head of a model?
You have to remove the last part (the classification head) of the model. (BERT base uncased + Classification) = new Model...
Read more >
Adding Custom Layers on Top of a Hugging Face Model
Some models on Hugging Face are trained on downstream tasks like question-answering or text classification and contain knowledge about the data they were ......
Read more >
Transfer Learning for Computer Vision Tutorial - PyTorch
We will use torchvision and torch.utils.data packages for loading the data. ... Load a pretrained model and reset final fully connected layer.
Read more >
Pretrained transformer framework on pediatric claims data for ...
Experimental results on two downstream tasks demonstrated the ... pre-training framework outperformed tailored task-specific models, ...
Read more >
How much does pre-trained information help? Partially re ...
of pre-trained knowledge for each layer on downstream tasks. ... model is to load it with pre-trained weights and then train this model...
Read more >
