Batch sizes
I’m trying to get some insight into batch sizes and whether the performance I’m seeing is expected. It seems I can’t set batch sizes much larger than about 32 with my dual Titan Xs. My understanding is that DataParallel splits that batch of 32 across the two GPUs, for an effective batch size of 16 per GPU per step. The model I’m training is all defaults: 4 LSTM layers with 400 hidden units. That differs a fair amount from many of the DeepSpeech 2 configurations in the paper, but I’ve seen references to them using batch sizes of 512 spread over 8 Titan Xs, which implies their system supports batches of 64 per GPU. It seems we should be able to get closer to that number unless I’m missing something. Any thoughts?
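For context, here is a minimal sketch of how PyTorch’s nn.DataParallel scatters a batch across GPUs; the input shapes and the standalone LSTM are illustrative stand-ins, not the actual deepspeech.pytorch model:

```python
import torch
import torch.nn as nn

# Illustrative model only -- stands in for the 4-layer, 400-unit LSTM stack.
model = nn.LSTM(input_size=161, hidden_size=400, num_layers=4, batch_first=True)

if torch.cuda.is_available():
    model = model.cuda()
    if torch.cuda.device_count() > 1:
        # DataParallel replicates the module on each GPU and splits the input
        # batch along dim 0, so a batch of 32 becomes ~16 per GPU on two cards.
        model = nn.DataParallel(model)

# (batch, time, features): a batch of 32 spectrogram-like utterances.
inputs = torch.randn(32, 200, 161)
if torch.cuda.is_available():
    inputs = inputs.cuda()

outputs, _ = model(inputs)  # outputs are gathered back onto the default GPU
print(outputs.shape)        # torch.Size([32, 200, 400])
```

So with two cards and a batch of 32, each replica holds the activations for only 16 utterances, but memory use per GPU still scales with sequence length, which is why filtering long utterances (discussed below) matters as much as the nominal batch size.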
Issue Analytics
- Created 6 years ago
- Comments: 8 (4 by maintainers)
Top GitHub Comments
We filter the data used for training to under a certain length. If you look at the librispeech.py script, for example, you can see there is a flag there for doing that filtering when the manifest is created.
@ryanleary: Ok, I see. Thank you so much!
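As a rough illustration of that kind of length filtering, a manifest-creation step might look like the sketch below. The flag names, default durations, and CSV layout here are assumptions for illustration; check librispeech.py in the repo for the actual options it exposes.

```python
import argparse
import wave
from pathlib import Path


def wav_duration(path):
    """Return the duration of a WAV file in seconds."""
    with wave.open(str(path), "rb") as w:
        return w.getnframes() / float(w.getframerate())


def create_manifest(wav_dir, txt_dir, out_path, min_duration=1.0, max_duration=15.0):
    """Write 'wav_path,txt_path' lines, skipping clips outside the duration bounds.

    The manifest format and duration defaults are hypothetical; the real
    script may differ, but the filtering idea is the same: long utterances
    are dropped so that a larger batch fits in GPU memory.
    """
    with open(out_path, "w") as out:
        for wav in sorted(Path(wav_dir).glob("*.wav")):
            dur = wav_duration(wav)
            if min_duration <= dur <= max_duration:
                txt = Path(txt_dir) / (wav.stem + ".txt")
                out.write(f"{wav},{txt}\n")


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--wav-dir", required=True)
    parser.add_argument("--txt-dir", required=True)
    parser.add_argument("--out", default="train_manifest.csv")
    parser.add_argument("--min-duration", type=float, default=1.0)
    parser.add_argument("--max-duration", type=float, default=15.0)
    args = parser.parse_args()
    create_manifest(args.wav_dir, args.txt_dir, args.out,
                    args.min_duration, args.max_duration)
```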