Discussions for training / VoxSRC
See original GitHub issue.

- Changing `--n_mels` from 40 to 64 leads to a small increase in performance.
- Using `--log_input` also leads to a small increase in performance (see the feature-extraction sketch after this list).
- Combining two loss functions (e.g. `angleproto` and `softmax`) sometimes has a positive effect. This should be defined as a new loss function in the `loss` directory that returns the sum of the two losses (see the combined-loss sketch after this list).
- Zero padding of the input has a significant adverse effect on performance. When there is a large variation in the length of the input audio files (e.g. VoxSRC), I recommend `--eval_frames 0`, which uses whatever length of audio is available without padding or cropping.
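Since a question later in this thread asks what `--log_input` actually does, here is a minimal sketch of the kind of feature pipeline these two flags control: mel filterbank extraction with a configurable number of bins, optionally followed by log compression. The exact preprocessing in the trainer may differ; the parameter values below (sample rate, window and hop sizes) are assumptions for illustration only.

```python
import torch
import torchaudio

n_mels = 64        # --n_mels: 64 mel bins instead of the default 40
log_input = True   # --log_input: apply log compression to the mel energies

melspec = torchaudio.transforms.MelSpectrogram(
    sample_rate=16000, n_fft=512, win_length=400, hop_length=160,
    n_mels=n_mels)

waveform = torch.randn(1, 16000)   # one second of dummy 16 kHz audio
features = melspec(waveform)       # shape: (1, n_mels, frames)

if log_input:
    # Log compression reduces the dynamic range of the mel energies,
    # which typically makes optimization a little easier.
    features = (features + 1e-6).log()
```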
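And here is a minimal sketch of a combined loss module written in the style of the `loss` directory. The module names `loss.angleproto` and `loss.softmax` and the `(loss, accuracy)` return convention follow the repository's pattern, but verify the details against your own checkout before relying on them.

```python
import torch.nn as nn

import loss.angleproto as angleproto
import loss.softmax as softmax

class LossFunction(nn.Module):
    """Sum of a classification loss and a metric-learning loss."""

    def __init__(self, **kwargs):
        super(LossFunction, self).__init__()
        self.softmax    = softmax.LossFunction(**kwargs)
        self.angleproto = angleproto.LossFunction(**kwargs)

    def forward(self, x, label=None):
        # x: (batch, utterances_per_speaker, embedding_dim)
        # Classification loss over the flattened embeddings ...
        nloss_s, prec1 = self.softmax(
            x.reshape(-1, x.size(-1)), label.repeat_interleave(x.size(1)))
        # ... plus the angular prototypical loss on the grouped embeddings.
        nloss_p, _ = self.angleproto(x, None)
        return nloss_s + nloss_p, prec1
```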
For example, this configuration gives 1.98% EER using the standard train and test lists. I believe that many of you have trained better models using this trainer. I would appreciate it if you are able to share your knowledge!
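The post does not spell out the full configuration behind the 1.98% EER figure. Purely as an illustration of how the flags above fit together, a run might look like the following; the script name `trainSpeakerNet.py` and the `--trainfunc softmaxproto` value are assumptions based on the repository layout, not the author's stated recipe.

```
python trainSpeakerNet.py \
    --n_mels 64 \
    --log_input True \
    --trainfunc softmaxproto \
    --eval_frames 0
```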
@zh794390558 Did you solve your problem of slow training? I am having the same problem: one epoch takes almost 3 hours (sometimes more) on 8 Tesla T4 GPUs with distributed training.
But my case is a little different, explained here in detail.
If you have solved your problem, could you please share your solution?
Hello, Joonson. Thank you for your ideas. Regarding the `--log_input` feature: what is the principle behind this method, and how does it improve performance? I hope you can reply. Thank you.