
How to use transfer learning?

See original GitHub issue

https://github.com/espnet/espnet/blob/f7e3dbb8f1e011c41973c4c77403edf8ac684126/espnet/bin/asr_train.py#L182-L184

Hi, I can’t find where the --asr-model arg is used. Could anyone do me a favor, please?

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments:27 (9 by maintainers)

Top GitHub Comments

1 reaction
b-flo commented, Jun 28, 2019

If you’re using the resume function, as @kan-bayashi said, the pre-trained model architecture must match the architecture you specified in your training config. In your case the only mismatched parameter is odim (your output dimension, extra labels included), so there are two possibilities:

  • The “bigger” dictionary you used doesn’t have the same size as the one in the pre-trained model; the same problem applies to the smaller one.
  • You’re using an old ESPnet model and there was a change to odim (for the extra labels, I guess), though that should have been obvious from the logs.

Edit: By the way, if you’re wondering, the shape printed in your log is the shape of the weight matrix. For example, in Linear the weights are stored transposed and only transposed back when the matmul is applied.
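The odim mismatch described above can be worked around in plain PyTorch by copying only the parameters whose shapes match and leaving the odim-dependent output layer freshly initialized. This is a generic sketch, not ESPnet’s own loading code; the toy `make_model` architecture and the dimensions are made up for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical toy model standing in for an ESPnet ASR network.
def make_model(idim, odim):
    return nn.Sequential(nn.Linear(idim, 16), nn.ReLU(), nn.Linear(16, odim))

pretrained = make_model(idim=8, odim=10)   # trained with the original dictionary
model = make_model(idim=8, odim=12)        # new run: extra labels -> larger odim

# Copy only parameters with matching shapes; the output layer whose shape
# depends on odim is skipped and keeps its fresh initialization.
# Note the printed shapes: nn.Linear(16, odim).weight is (odim, 16), i.e.
# the weight matrix is stored transposed, as mentioned above.
src = pretrained.state_dict()
dst = model.state_dict()
loaded, skipped = [], []
for name, tensor in src.items():
    if name in dst and dst[name].shape == tensor.shape:
        dst[name] = tensor
        loaded.append(name)
    else:
        skipped.append(name)
model.load_state_dict(dst)
print("loaded:", loaded)
print("skipped (shape mismatch):", skipped)
```

Only the encoder-side layers transfer; the skipped output layer is then trained from scratch on the new label set.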

1 reaction
desothier1 commented, Jun 22, 2019

OK, previously I used --resume <your_pretrained_model_path> to resume and continue training from a previously trained model, but with exactly the same training data I had used before.

What I mean by transfer learning is starting from a previously trained model but using different training data. Just specifying the path to the previously trained model with --resume <your_pretrained_model_path> did not work.

Or do you simply mean that, on top of using --resume <your_pretrained_model_path>, I should first change torch_resume(args.resume, trainer) into torch_load(args.resume, model) (I guess not…)?

Thanks for your answer.
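The distinction the question turns on can be illustrated in plain PyTorch: resuming restores the full training state (weights, optimizer, iteration counter), while transfer learning loads only the weights into a fresh training run. This is a generic sketch of that difference, not the internals of ESPnet’s torch_resume/torch_load helpers:

```python
import io
import torch
import torch.nn as nn

model = nn.Linear(4, 2)
optimizer = torch.optim.Adam(model.parameters())

# "Resume": save and restore the full training state, so training
# continues exactly where it stopped, on the same data.
buf = io.BytesIO()
torch.save({"model": model.state_dict(),
            "optimizer": optimizer.state_dict(),
            "iteration": 1000}, buf)
buf.seek(0)
snap = torch.load(buf)
model.load_state_dict(snap["model"])
optimizer.load_state_dict(snap["optimizer"])

# "Transfer learning": load ONLY the weights into a fresh model and start
# a new run (new optimizer, new data, iteration counter back at zero).
new_model = nn.Linear(4, 2)
new_model.load_state_dict(snap["model"])
new_optimizer = torch.optim.Adam(new_model.parameters())  # fresh state
```

With `--resume`, ESPnet takes the first path; for transfer learning you want the second, which is why the checkpoint's optimizer state and iteration counter are deliberately discarded.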

Read more comments on GitHub

Top Results From Across the Web

A Gentle Introduction to Transfer Learning for Deep Learning
In transfer learning, we first train a base network on a base dataset and task, and then we repurpose the learned features, or...

A Practical Tutorial With Examples for Images and Text in Keras
Transfer learning is particularly very useful when you have a small training dataset. In this case, you can, for example, use the weights...

A Comprehensive Hands-on Guide to Transfer Learning with ...
Transfer learning is the idea of overcoming the isolated learning paradigm and utilizing knowledge acquired for one task to solve related ones.

What Is Transfer Learning? A Guide for Deep Learning | Built In
Approaches to Transfer Learning · 1. Training a Model to Reuse it · 2. Using a Pre-Trained Model · 3. Feature Extraction.

Transfer learning and fine-tuning | TensorFlow Core
The intuition behind transfer learning for image classification is that if a model is trained on a large and general enough dataset, this...
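The feature-extraction approach mentioned in the results above can be sketched in a few lines of PyTorch: freeze a pre-trained base network and train only a new task head. The toy MLP base and the dimensions here are placeholders for any real pre-trained backbone:

```python
import torch
import torch.nn as nn

# Toy base network standing in for a pre-trained backbone.
base = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64))
for p in base.parameters():
    p.requires_grad = False  # frozen: no gradients computed for the base

head = nn.Linear(64, 5)      # new output layer for the new task
model = nn.Sequential(base, head)

# Only the head's parameters are handed to the optimizer.
optimizer = torch.optim.SGD(head.parameters(), lr=0.01)

x = torch.randn(8, 32)
y = torch.randint(0, 5, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()   # gradients flow only into the head
optimizer.step()

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print("trainable params:", trainable)  # only the head: 64*5 + 5 = 325
```

Fine-tuning, the other common approach, differs only in that some or all base parameters keep `requires_grad=True`, usually with a much smaller learning rate.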
