The pre-trained model `asr-transformer-transformerlm-librispeech` does not contain all required information
See original GitHub issue

The pre-trained model at
https://huggingface.co/speechbrain/asr-transformer-transformerlm-librispeech
does not contain the state information for `speechbrain.processing.features.InputNormalization`.
As a consequence, the WER at inference time differs if you change the batch size, because the mean and standard deviation are computed from the data within each batch.
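The batch-size dependence described above can be illustrated with a minimal sketch (this is not SpeechBrain's actual code; `normalize_batch` is a hypothetical stand-in for per-batch normalization when no global statistics are loaded):

```python
# Minimal sketch: when global stats are missing, features are normalized
# with per-batch mean/std, so an utterance's normalized features depend
# on which other utterances share its batch.
import numpy as np

def normalize_batch(feats):
    # feats: (batch, time, dim); stats computed over the whole batch
    mean = feats.mean(axis=(0, 1), keepdims=True)
    std = feats.std(axis=(0, 1), keepdims=True)
    return (feats - mean) / (std + 1e-10)

rng = np.random.default_rng(0)
utt = rng.normal(loc=3.0, scale=2.0, size=(1, 50, 80))
other = rng.normal(loc=-1.0, scale=0.5, size=(3, 50, 80))

solo = normalize_batch(utt)[0]                               # batch size 1
grouped = normalize_batch(np.concatenate([utt, other]))[0]   # batch size 4

# Same utterance, different normalized features depending on batch size:
print(np.abs(solo - grouped).max() > 0.1)
```

Because the downstream acoustic model sees different inputs for the same utterance, the decoded hypotheses (and hence WER) shift with batch size.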
Issue Analytics
- State:
- Created 2 years ago
- Comments: 10
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Hi @danpovey, thank you for the comments. The `InputNormalization` part needs a refactor (e.g., we want to vectorize more); @30stomercury is already working on that (the current code applies `x = (x - self.glob_mean.data) / (self.glob_std.data)`). However, this is something to fix. Thank you again for your help!
Also, regarding the way `InputNormalization` stores per-speaker statistics: I'm not sure that's considered totally normal in terms of testing protocol, since you end up remembering stats from training speakers who might recur at test time. For the global stats, rather than saving/loading them manually, it would be more normal I think to just register them as buffers, e.g.
`self.register_buffer('glob_mean', ...)`
(but you'd need to know the dimension ahead of time, I think). This will also cause DDP to copy the stats of job 0 to all other jobs (not that this should matter much for the global stats).
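The buffer suggestion above can be sketched as follows (a minimal illustration, not SpeechBrain's actual implementation; the class name `GlobalNorm` and the feature dimension 80 are assumptions):

```python
# Sketch of registering global normalization stats as buffers, so they
# are saved/loaded with the model's state_dict and synced from rank 0
# under DistributedDataParallel.
import torch

class GlobalNorm(torch.nn.Module):
    def __init__(self, dim):
        super().__init__()
        # Buffers persist in state_dict() (unlike plain tensor attributes),
        # but the feature dimension must be known ahead of time.
        self.register_buffer("glob_mean", torch.zeros(dim))
        self.register_buffer("glob_std", torch.ones(dim))

    def forward(self, x):
        # x: (batch, time, dim); normalize with the stored global stats
        return (x - self.glob_mean) / self.glob_std

norm = GlobalNorm(80)
print("glob_mean" in norm.state_dict())  # buffers travel with the checkpoint
```

With the stats stored this way, a pretrained checkpoint would carry them automatically, and inference would no longer fall back to per-batch statistics.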