IEMOCAP recipe - missing lengths in call to embedding module?
Hello SpeechBrain team,
While trying to reproduce the results reported at https://github.com/speechbrain/speechbrain/tree/develop/recipes/IEMOCAP, I noticed that the accuracy on the test set varies with the batch size of the test-set dataloader, sometimes dropping by up to 7% when batch_size = 1.
Further investigation of the code made me think that the ‘lengths’ parameter might be missing in the call to the embedding module (train.py line 37).
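To illustrate why ignoring lengths can make the result depend on the batch, here is a toy, recipe-independent sketch in plain PyTorch (made-up shapes, not the actual IEMOCAP pipeline) showing that naive mean pooling over zero-padded features changes with how much padding the batch forces onto an utterance:

```python
import torch

torch.manual_seed(0)

# One utterance: 50 frames of 40-dim features (made-up shapes).
utt = torch.randn(50, 40)

# Case A: pooled on its own, as with batch_size = 1 at test time.
mean_alone = utt.mean(dim=0)

# Case B: the same utterance zero-padded to 80 frames because it was batched
# with a longer one; naive pooling now averages the padded frames as well.
padded = torch.cat([utt, torch.zeros(30, 40)], dim=0)
mean_batched = padded.mean(dim=0)

print(torch.allclose(mean_alone, mean_batched))    # False
print((mean_alone - mean_batched).abs().max())     # non-negligible gap
```

If those pooled statistics feed the classifier, the prediction for a given utterance can change with the batch composition, which would be consistent with the accuracy differences described above.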
I tried to change it to:
embeddings = self.modules.embedding_model(feats, lengths=lens)
and ran the train and test stages again. With this change the variations are smaller but still present: around 1% difference from one test run to another (the train stage was done only once).
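For context, here is a minimal sketch of where the proposed call would sit in a SpeechBrain Brain subclass; the module names (compute_features, mean_var_norm, embedding_model, classifier) and the batch.sig key follow the usual recipe layout but are assumptions here, not a copy of the actual train.py:

```python
import speechbrain as sb

class EmoIdBrain(sb.Brain):
    """Sketch of the forward pass with the proposed lengths argument."""

    def compute_forward(self, batch, stage):
        batch = batch.to(self.device)
        wavs, lens = batch.sig                      # lens: relative lengths in [0, 1]
        feats = self.modules.compute_features(wavs)
        feats = self.modules.mean_var_norm(feats, lens)
        # Proposed fix: forward the relative lengths so that zero-padded
        # frames are excluded from the pooling inside the embedding model.
        embeddings = self.modules.embedding_model(feats, lengths=lens)
        return self.modules.classifier(embeddings)
```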
So I have 2 questions:
- Do you think the modification above is correct?
- What could explain the remaining fluctuations with respect to the batch size of the test set?
Thank you in advance
Top GitHub Comments
Hi @aheba @TParcollet @mravanelli. Thank you for the fix, and also for the additional work with wav2vec2. Very interesting indeed…
I think the correct way is to pass the lengths parameter to the embedding model (this way zero-padded elements are removed from the statistical pooling operation). @aheba, can we do it in your ongoing PR?
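To make that mechanism concrete, here is a hand-rolled sketch of length-aware statistics pooling (not SpeechBrain's actual StatisticsPooling implementation, and the relative-lengths convention is an assumption): masking the padded frames makes the pooled mean/std of an utterance independent of how much padding its batch introduces.

```python
import torch

def masked_stats_pooling(feats, lens):
    """Mean/std pooling that ignores zero-padded frames.

    feats: (batch, time, feat_dim), zero-padded along the time axis.
    lens:  (batch,) relative lengths in [0, 1] (SpeechBrain-style convention).
    """
    batch, time, _ = feats.shape
    n_frames = (lens * time).round().long()
    mask = torch.arange(time, device=feats.device)[None, :] < n_frames[:, None]
    mask = mask.unsqueeze(-1).to(feats.dtype)       # (batch, time, 1)

    mean = (feats * mask).sum(dim=1) / mask.sum(dim=1)
    var = ((feats - mean.unsqueeze(1)) ** 2 * mask).sum(dim=1) / mask.sum(dim=1)
    return torch.cat([mean, var.sqrt()], dim=1)     # (batch, 2 * feat_dim)
```

With this kind of masking, appending more zero padding to an utterance leaves its pooled statistics unchanged, which is why passing the lengths should remove most of the batch-size dependence.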