
Loading finetuned checkpoint. Local method not working.


Hi! For some time I thought this was the correct way to load a pretrained named model (such as tera, mockingjay, cpc), as provided in the example_extract.py files:

import importlib

ckpt_path = '/path/to/valid/ckpt-1337steps.pt'
Upstream_local = getattr(importlib.import_module('hubconf'), 'tera')
model = Upstream_local(ckpt=ckpt_path).to(device)

But then I noticed it never uses ckpt to load the model. However, I was using <model_name> instead of <model_name>_local (as written in the examples) because the latter raises errors:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-23-44af8ca274e3> in <module>()
     11 ckpt_root = '/content/TeraFineTuned/'
     12 Upstream_local = getattr(importlib.import_module('hubconf'), 'tera_local')
---> 13 model = Upstream_local(ckpt=ckpt_root+"states-150000.ckpt").to(device)

3 frames
/content/S3PRL/upstream/tera/hubconf.py in tera_local(ckpt, feature_selection, *args, **kwargs)
     15     if feature_selection is None:
     16         feature_selection = -1
---> 17     return _UpstreamExpert(ckpt, feature_selection, *args, **kwargs)
     18 
     19 

/content/S3PRL/upstream/mockingjay/expert.py in __init__(self, ckpt, feature_selection, **kwargs)
     37                    'permute_input' : 'False' }
     38 
---> 39         self.transformer = PretrainedTransformer(options, inp_dim=-1)
     40         assert hasattr(self.transformer, 'extracter'), 'This wrapper only supports `on-the-fly` ckpt with built in feature extracters.'
     41 

/content/S3PRL/upstream/mockingjay/builder.py in __init__(self, options, inp_dim, config, online_config, verbose)
    255     """
    256     def __init__(self, options, inp_dim, config=None, online_config=None, verbose=False):
--> 257         super(PretrainedTransformer, self).__init__(options, inp_dim, config, online_config, verbose)
    258 
    259         # Build model

/content/S3PRL/upstream/mockingjay/builder.py in __init__(self, options, inp_dim, config, on_the_fly_config, verbose)
     60 
     61         # Set model config
---> 62         self.model_config = TransformerConfig(self.config['transformer'])
     63         self.hidden_size = self.model_config.hidden_size
     64         self.num_layers = self.model_config.num_hidden_layers

KeyError: 'transformer'

I think this is a bug. Is there currently any valid way to load a fine-tuned checkpoint?

Edit: the same error occurs when using torch.hub.load:

model = torch.hub.load('s3prl/s3prl', 'tera_local', ckpt=ckpt_path).to(device)
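Not an official workaround, just a hedged diagnostic sketch: since the builder in the traceback fails on self.config['transformer'], one first step is to torch.load the checkpoint directly and check which keys it actually contains. The path below is the one from the traceback above.

import torch

# Diagnostic sketch (assumption: the fine-tuned checkpoint is a plain torch.save'd dict).
ckpt = torch.load('/content/TeraFineTuned/states-150000.ckpt', map_location='cpu')

print(type(ckpt))
if isinstance(ckpt, dict):
    # Look for the entry that should hold the transformer config the builder expects.
    print(list(ckpt.keys()))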

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

1 reaction
leo19941227 commented on Feb 28, 2021

Hey,

This sounds interesting: perhaps unsupervised pretraining -> supervised finetuning has some essential difference from supervised pretraining from scratch, and we can examine whether the former yields better results. Sure, I will come up with a way to support this.

0 reactions
SolomidHero commented on Feb 6, 2021

Hi, thank you for your interest.

In my opinion, since tera and mockingjay are transformer representations learned via a reconstruction loss on predicting mels, their representations are robust and reproduce mels (and then speech) with high quality. But if I need to extract only timbre/acoustic features or only linguistic features, these representations can be improved by fine-tuning, thus decreasing the leakage of other (unneeded) features.

In my experiment, I wonder whether my model's quality could be improved if the representations used as input PPG features (basically linguistic only, without acoustics) were produced by a fine-tuned tera/mockingjay (it currently uses the pre-trained one). Here I use "PPG features" to mean the hidden states of a model trained for ASR or phone recognition.

Also, the above applies to other models (cpc, apc, wav2vec2, etc.).
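
For clarity, a minimal sketch of what I mean by using hidden states as PPG features; this assumes a generic upstream model whose forward pass returns frame-level representations, and extract_ppg_like_features is a hypothetical helper, not s3prl's interface:

import torch

# Hypothetical helper: treat the frame-level hidden states of a (fine-tuned)
# upstream model as PPG-like, mostly linguistic features for a downstream model.
# Assumes `model` maps a waveform batch to features of shape (batch, frames, hidden).
def extract_ppg_like_features(model: torch.nn.Module, wav: torch.Tensor) -> torch.Tensor:
    model.eval()
    with torch.no_grad():
        return model(wav)

# Example with dummy input: one second of 16 kHz audio.
# ppg = extract_ppg_like_features(model, torch.randn(1, 16000))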

