
Rethink usage pattern for pretrained models

See original GitHub issue

🚀 Feature

Switch to using SomeModel.from_pretrained('pretrained-model-name') for pretrained models.

Motivation

Seems we are following torchvision's pattern of having a 'pretrained' argument in the init of our models to initialize a pretrained model. In my opinion, this is extremely confusing: it makes the other init args and kwargs ambiguous or useless.

Pitch

Add a .from_pretrained classmethod to models and initialize an instance of the class from it. Pretrained models should carry any hparams needed to fill out the init.

from pl_bolts.models import VAE

model = VAE.from_pretrained('imagenet2012')
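A minimal sketch of how such a classmethod could work, assuming a per-class registry that maps a pretrained-model id to a checkpoint URL plus the init hparams needed to rebuild the matching architecture. All names here (PretrainedMixin, pretrained_urls, load_checkpoint) are illustrative, not the actual bolts API:

```python
class PretrainedMixin:
    # Hypothetical registry: pretrained id -> (checkpoint URL, init hparams).
    pretrained_urls = {}

    @classmethod
    def from_pretrained(cls, name):
        try:
            url, hparams = cls.pretrained_urls[name]
        except KeyError:
            raise KeyError(
                f"unknown pretrained id {name!r}; "
                f"available: {sorted(cls.pretrained_urls)}"
            )
        model = cls(**hparams)      # build the architecture from stored hparams
        model.load_checkpoint(url)  # then download and load the weights
        return model


class VAE(PretrainedMixin):
    pretrained_urls = {
        "imagenet2012": ("https://example.com/vae-imagenet.ckpt",
                         {"input_height": 224}),
    }

    def __init__(self, input_height):
        self.input_height = input_height

    def load_checkpoint(self, url):
        self.checkpoint = url  # stand-in for real weight loading


model = VAE.from_pretrained("imagenet2012")
```

The key property is that the user never has to know which hparams a given checkpoint requires; the registry entry supplies them.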

Alternatives

Additional context

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 5
  • Comments: 5 (5 by maintainers)

Top GitHub Comments

3 reactions
williamFalcon commented, Sep 11, 2020

Oh, I see. It's an id, not a dataset. Yeah, that works.

For instance, we can have many backbones with different datasets as well:

CPC.from_pretrained('resnet18-imagenet')
CPC.from_pretrained('resnet50-imagenet')
CPC.from_pretrained('resnet18-stl10')
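The ids above encode both the backbone and the dataset. Under that scheme, a small (hypothetical) helper could split an id back into its parts:

```python
def parse_pretrained_id(name):
    """Split a '<backbone>-<dataset>' pretrained id into its two parts.

    The id scheme comes from the discussion above; this helper is
    illustrative and not part of the actual bolts API.
    """
    backbone, _, dataset = name.partition("-")
    if not dataset:
        raise ValueError(f"expected '<backbone>-<dataset>', got {name!r}")
    return backbone, dataset
```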
1 reaction
ananyahjha93 commented, Sep 12, 2020

@williamFalcon @Borda @nateraw I included this pattern in the latest AE and VAE commits to bolts. A few points I realized:

  1. We can move from_pretrained() into Lightning itself as a method to override.
  2. from_pretrained() needs to be an instance method, not a static method. In most cases, you will initialize the LightningModule with specific params matching the weights being loaded.
vae = VAE(input_height=32, first_conv=True)
vae = vae.from_pretrained('cifar10-resnet18')

In this example, stl10 weights have a different configuration for the encoder of the VAE. But at the same time, the internal loading method uses a strict=False flag, so users can still load stl10 weights into the encoder configuration built for the cifar10 dataset.
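A plain-dict sketch of the non-strict loading behavior described above: copy over only the checkpoint entries whose keys exist in the current model, and skip the rest. This mimics what torch.nn.Module.load_state_dict(..., strict=False) does for missing/unexpected keys (real tensors and shape checks omitted here):

```python
def load_non_strict(model_state, checkpoint_state):
    """Copy matching keys from checkpoint_state into model_state.

    Returns the lists of loaded and skipped keys, so a caller can log
    what was ignored instead of failing hard.
    """
    loaded, skipped = [], []
    for key, value in checkpoint_state.items():
        if key in model_state:
            model_state[key] = value
            loaded.append(key)
        else:
            skipped.append(key)  # key absent from the model: ignore it
    return loaded, skipped
```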

  3. Having this pattern allows us to test the correct loading of weights through the from_pretrained() function. @williamFalcon, cases like the corrupt ImageNet weights for CPC will be caught automatically.

I have added all of this, plus tests, for the AE and VAE classes I updated in bolts.


