Loading finetuned checkpoint. Local method not working.
Hi! For some time I thought this was the correct way to load a pretrained named model (such as `tera`, `mockingjay`, `cpc`), as provided in the `example_extract.py` files:
ckpt_path = '/path/to/valid/ckpt-1337steps.pt'
Upstream_local = getattr(importlib.import_module('hubconf'), 'tera')
model = Upstream_local(ckpt=ckpt_path).to(device)
But then I noticed that it never uses `ckpt` to load the model. I had been using `<model_name>` instead of `<model_name>_local` (as written in the examples) because the latter method fails with the following error:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-23-44af8ca274e3> in <module>()
11 ckpt_root = '/content/TeraFineTuned/'
12 Upstream_local = getattr(importlib.import_module('hubconf'), 'tera_local')
---> 13 model = Upstream_local(ckpt=ckpt_root+"states-150000.ckpt").to(device)
3 frames
/content/S3PRL/upstream/tera/hubconf.py in tera_local(ckpt, feature_selection, *args, **kwargs)
15 if feature_selection is None:
16 feature_selection = -1
---> 17 return _UpstreamExpert(ckpt, feature_selection, *args, **kwargs)
18
19
/content/S3PRL/upstream/mockingjay/expert.py in __init__(self, ckpt, feature_selection, **kwargs)
37 'permute_input' : 'False' }
38
---> 39 self.transformer = PretrainedTransformer(options, inp_dim=-1)
40 assert hasattr(self.transformer, 'extracter'), 'This wrapper only supports `on-the-fly` ckpt with built in feature extracters.'
41
/content/S3PRL/upstream/mockingjay/builder.py in __init__(self, options, inp_dim, config, online_config, verbose)
255 """
256 def __init__(self, options, inp_dim, config=None, online_config=None, verbose=False):
--> 257 super(PretrainedTransformer, self).__init__(options, inp_dim, config, online_config, verbose)
258
259 # Build model
/content/S3PRL/upstream/mockingjay/builder.py in __init__(self, options, inp_dim, config, on_the_fly_config, verbose)
60
61 # Set model config
---> 62 self.model_config = TransformerConfig(self.config['transformer'])
63 self.hidden_size = self.model_config.hidden_size
64 self.num_layers = self.model_config.num_hidden_layers
KeyError: 'transformer'
I think this is a bug. Is there any valid way to load a fine-tuned checkpoint for now?
Edit: the same error occurs when using `torch.hub.load`:
model = torch.hub.load('s3prl/s3prl', 'tera_local', ckpt=ckpt_path).to(device)
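In case it helps with debugging, this is the kind of inspection I can run on my side (a minimal sketch; I am assuming the fine-tuned checkpoint is a plain torch-serialized dictionary, and the path is the same one as in the traceback above). The `tera_local` builder apparently indexes a stored config with `['transformer']`, so listing the checkpoint's top-level keys should show which part is missing from my fine-tuned file:

```python
import torch

# Diagnostic only (assumption: the fine-tuned .ckpt is an ordinary torch-serialized dict,
# like the pretraining checkpoints). List its top-level keys to compare against what the
# builder expects when it raises KeyError: 'transformer'.
ckpt_path = '/content/TeraFineTuned/states-150000.ckpt'
state = torch.load(ckpt_path, map_location='cpu')

print(type(state))
if isinstance(state, dict):
    for key, value in state.items():
        print(key, type(value))
```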
Hey,
This sounds interesting: perhaps `unsupervised pretraining -> supervised finetuning` has some essential difference from `supervised pretraining from scratch`, and we can examine whether the former yields better results. Sure, then I will come up with a way to support this.

Hi, thank you for your interest.
In my opinion, since tera and mockingjay are transformer representations learned via a reconstruction loss on predicting mels, their representations are robust and of high quality for reproducing mels (and hence speech). But if I need to extract only features dependent on timbre/acoustics, or only linguistic features, these representations can be improved by finetuning, thus decreasing the leakage of other (unneeded) features.
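To make the objective I am referring to concrete, here is a toy sketch of masked mel reconstruction (my own simplified illustration, not the actual S3PRL training code; the masking strategy, ratio, and loss layout are assumptions):

```python
import torch
import torch.nn.functional as F

def toy_masked_reconstruction_loss(encoder, mels, mask_ratio=0.15):
    """Toy objective: hide some mel frames and train the encoder to predict them back (L1 loss)."""
    # mels: (batch, time, n_mels)
    mask = torch.rand(mels.shape[:2], device=mels.device) < mask_ratio  # (batch, time)
    corrupted = mels.masked_fill(mask.unsqueeze(-1), 0.0)               # zero out the masked frames
    reconstructed = encoder(corrupted)                                  # expected shape: (batch, time, n_mels)
    return F.l1_loss(reconstructed[mask], mels[mask])
```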
In my experiment, I wonder whether my model's quality could be improved if the representations used as input PPG features (basically linguistic only, without acoustics) were produced by a finetuned tera/mockingjay (currently it uses the pre-trained one). Here I denote as PPG features the hidden states of a model trained for ASR or phone recognition tasks.
Also, the above applies to other models as well (cpc, apc, wav2vec2, etc.).
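For completeness, this is roughly how I intend to use such a finetuned upstream once `_local` loading works: pass the waveforms through it and feed the hidden representations to my downstream model as PPG-like features. The snippet mirrors the upstream usage from `example_extract.py` as I understand it (a list of waveform tensors in, features out), but please treat the exact interface as my assumption.

```python
import torch

def extract_ppg_like_features(upstream, wavs):
    """Run waveforms through a (finetuned) upstream and return its hidden representations.

    'upstream' stands for the finetuned tera/mockingjay model once it can be loaded
    via the *_local entry; 'wavs' is a list of 1-D waveform tensors, following the
    example_extract.py usage as I understand it.
    """
    upstream.eval()
    with torch.no_grad():
        return upstream(wavs)  # hidden representations used as PPG-like (linguistic) features

# Dummy call with random 1-second waveforms (16 kHz), just to show the intended usage:
# features = extract_ppg_like_features(model, [torch.randn(16000) for _ in range(4)])
```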