Default training not converging
See original GitHub issue

Running just the train_softmax.py command from the example run.sh script on the sample data does not seem to converge, even after 50 epochs.
Command:
python -u ./code/1-development/train_softmax.py --num_epochs=50 --batch_size=3 --development_dataset_path=data/development_sample_dataset_speaker.hdf5 --train_dir=results/TRAIN_CNN_3D/train_logs
Output: the attached loss and learning-rate plots (figures not reproduced here).
Issue Analytics
- Created: 6 years ago
- Comments: 6 (3 by maintainers)
Top Results From Across the Web

Things to try when Neural Network not Converging
I know that the things to try would vary widely depending on network architecture. But tweaking which parameters (learning rate, momentum, initial weights, ...)

When does a neural network fail to converge?
In simple words, we can think of failure in convergence as a condition where we can't find the convergence point in the learning ...

training - Weights not converging while cost function has ...
The weights in a model do not need to converge to stop training. One possible explanation is that the model error surface has ...

When does the problem arise of neural networks not ... - Quora
It can be that the series that elapses the flow diverges - meaning it does not converge by definition - it doesn't close ...

Training using custom dataset does not converge #920 - GitHub
Training using custom dataset does not converge #920 ... So I left the default anchor ratios since they are valid. I have tried ...
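The results above repeatedly point at the learning rate as the first knob to try when training does not converge. As a minimal, framework-agnostic sketch of that idea (all names here are illustrative and not part of this repository or of train_softmax.py), one can halve the learning rate whenever the loss stops improving:

```python
class PlateauLR:
    """Illustrative reduce-on-plateau schedule: track the best loss seen so far
    and halve the learning rate after `patience` non-improving steps."""

    def __init__(self, lr, patience=3, factor=0.5, min_delta=1e-4):
        self.lr = lr                # current learning rate
        self.patience = patience    # allowed non-improving steps
        self.factor = factor        # multiplier applied on plateau
        self.min_delta = min_delta  # minimum change that counts as improvement
        self.best = float("inf")
        self.bad_steps = 0

    def step(self, loss):
        """Record one training loss; return the (possibly reduced) learning rate."""
        if loss < self.best - self.min_delta:
            self.best = loss
            self.bad_steps = 0
        else:
            self.bad_steps += 1
            if self.bad_steps >= self.patience:
                self.lr *= self.factor
                self.bad_steps = 0
        return self.lr
```

If the loss curve in the plots is flat rather than diverging, a schedule like this (or simply passing a smaller initial learning rate to the training script, if it exposes one) is a cheap first experiment before touching the architecture.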
Section VI
However, I accept that it could be explained better in the paper.
Regarding the number of samples per speaker: in my experiments I had more than 50 per speaker. If you take 50 for development and 50 for enrollment/evaluation, that should make for a fair experiment, although your background model may not be that strong for generalization. In any case, you can give it a shot!
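The suggested 50/50 split per speaker could be sketched like this (purely illustrative; the function and data layout are assumptions, not part of the repository):

```python
import random

def split_speaker_utterances(utterances_by_speaker, n_dev=50, seed=0):
    """Shuffle each speaker's utterances and take n_dev for development and
    the next n_dev for enrollment/evaluation. Speakers with fewer than
    2 * n_dev utterances are skipped so both partitions stay balanced."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    development, enrollment = {}, {}
    for speaker, utts in utterances_by_speaker.items():
        if len(utts) < 2 * n_dev:
            continue  # not enough data for a fair, disjoint split
        utts = list(utts)
        rng.shuffle(utts)
        development[speaker] = utts[:n_dev]
        enrollment[speaker] = utts[n_dev:2 * n_dev]
    return development, enrollment
```

Keeping the two partitions disjoint per speaker is what makes the comparison fair: no utterance used to train the background model reappears at enrollment or evaluation time.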
Oops, I might have had an old version of the paper:
I’ll have a look at the newest version and re-open if there’s an issue. Thanks for clarifying!