Speech Recognition Seems To Overfit
Hi, I don’t know if this is an issue with the framework, but I did not know where else to ask. I have been training the speech recognition example (speech.yml) for about 80 epochs on a Titan X, inside a tensorflow-gpu-py3-based Docker image. The training loss has gone way down, but my validation loss is still very high, and the sample predictions it prints are gibberish.
Example output:
Epoch 80/inf, loss=5.845: 100%|##########| 2432/2432 [07:04<00:00, 7.80samples/s]
Validating, loss=766.122: 94%|#########4| 256/271 [00:21<00:01, 13.70samples/s]
Prediction: "thg asi bw a hnta e tb incotnetk rndegibnrtrlan ty bmna ekftett trelaob"
Truth: "and what inquired missus macpherson has mary ann given you her love"
- Is this sort of behavior expected this early in training?
- How long would you expect to have to train this model before it starts producing reasonable results?
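The gap between the two losses above (5.845 on training vs. 766.122 on validation) is the classic overfitting signature. One quick way to quantify how far a prediction is from its transcript is character error rate (CER). A rough sketch, using the sample output from this issue; the hand-rolled `levenshtein` helper is for illustration only and is not part of the framework:

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance (insert/delete/substitute),
    # keeping only one row of the DP table at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,                 # deletion
                cur[j - 1] + 1,              # insertion
                prev[j - 1] + (ca != cb),    # substitution (free on match)
            ))
        prev = cur
    return prev[-1]

def cer(prediction: str, truth: str) -> float:
    # Character error rate: edit distance normalized by reference length.
    return levenshtein(prediction, truth) / max(len(truth), 1)

pred = "thg asi bw a hnta e tb incotnetk rndegibnrtrlan ty bmna ekftett trelaob"
truth = "and what inquired missus macpherson has mary ann given you her love"
print(f"CER: {cer(pred, truth):.2f}")
```

Tracking CER on the validation set alongside the loss makes it obvious whether the model is actually learning transcriptions or just memorizing the training audio.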
Issue Analytics
- Created: 7 years ago
- Comments: 6
Top Results From Across the Web
Do people “cheat” by overfitting test data - Ehud Reiter's Blog
From a scientific perspective, the key thing here is to make the scope clear in our hypothesis and claims. Eg, explicitly say that...
Improving sequence-to-sequence speech recognition training ...
One solution to the overfitting problem is increasing the amount of available training data and the variety exhibited by the training data with ...
What is Overfitting? - IBM
When the model memorizes the noise and fits too closely to the training set, the model becomes “overfitted,” and it is unable to...
Overfitting Mechanism and Avoidance in Deep Neural Networks
By separating samples into correctly and incorrectly classified ones, we show that they behave very differently, where the loss decreases in the correct...
Continuous speech recognition with ESP32 for numbers (0--9)
Overfitting - as it's only my voice? I am using the default 1D model, which is pretty simple and may not be able...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@bharris47 We just put up some bigger data sets. Can you point your train url and checksum to one of these?
The 10-hour set is double the size of the default dataset for the stock speech example. You can keep going up in scale, and by the 50-hour mark you’re bound to start seeing pretty good output.
Note: The above datasets are fractions of the 100-hour LibriSpeech dataset (100% = 100p, 50% = 50p, …). They should not be concatenated, because 100p contains 50p, which contains 20p, which contains 10p.
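Once you have a new train URL and checksum in hand, it’s worth verifying the downloaded tarball before kicking off another long training run. A minimal sketch, assuming the checksum is a SHA-256 hex digest (the stock speech.yml shows the exact field names and algorithm the framework expects):

```python
import hashlib
import os
import tempfile

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so a multi-gigabyte tarball never sits in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Demo on a throwaway file; in practice, point this at the downloaded
# dataset tarball and compare against the checksum value in your config.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"dataset bytes")
    path = f.name

digest = sha256_of(path)
os.unlink(path)
print(digest)
```

A mismatched digest usually means a truncated or corrupted download, which is cheaper to catch here than 80 epochs in.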
Glad to hear! Feel free to use our data format as a template for adding even more data! It’s just a simple tarball.