
IEMOCAP recipe - missing lengths in call to embedding module?


Hello Speechbrain team,

While trying to reproduce the results reported in https://github.com/speechbrain/speechbrain/tree/develop/recipes/IEMOCAP, I noticed that the accuracy on the test set varies with the batch size of the test-set dataloader, sometimes dropping by up to 7% when batch_size = 1.

Further investigation of the code suggests that the ‘lengths’ parameter may be missing from the call to the embedding module (train.py, line 37).

I tried changing it to embeddings = self.modules.embedding_model(feats, lengths=lens) and reran the train and test stages. The variations are smaller but still present: around a 1% difference from one test run to another (with the train stage done only once).

So I have 2 questions:

  1. Do you think the modification above is correct?
  2. What could explain the remaining fluctuations with respect to the test-set batch size?
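
To illustrate why pooling without lengths depends on batch composition, here is a minimal sketch (hypothetical tensors, not the recipe's actual pooling code) showing how zero padding dilutes a plain mean pool when utterances are batched together:

```python
import torch

# Two utterances of different lengths; batching pads the shorter one with zeros.
long_utt = torch.randn(1, 6, 3)   # (batch, time, dim), 6 real frames
short_utt = torch.ones(1, 4, 3)   # constant features, so its true mean is 1.0

# Pooling that ignores lengths: a plain mean over the time axis
pool = lambda x: x.mean(dim=1)

# batch_size = 1: the short utterance is pooled over its 4 real frames
solo = pool(short_utt)            # every entry is 1.0

# batch_size = 2: the short utterance is zero-padded to 6 frames first
padded = torch.cat([short_utt, torch.zeros(1, 2, 3)], dim=1)
batched = pool(torch.cat([long_utt, padded], dim=0))[1]

# 1.0 vs ~0.667: the padded zeros pull the mean down, so the embedding
# (and hence the prediction) changes with how the batch is assembled.
print(solo[0, 0].item(), batched[0].item())
```

The same utterance thus produces different embeddings depending on how much padding the batch forces on it, which is consistent with accuracy changing with the test-set batch size.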

Thank you in advance

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 5

Top GitHub Comments

1 reaction
wikong commented, Oct 18, 2021

Hi @aheba @TParcollet @mravanelli Thank you for the fix, and also for the additional work with wav2vec2. Very interesting indeed…

1 reaction
mravanelli commented, Oct 16, 2021

I think the correct way is to pass the lengths parameter to the embedding model (this way zero-padded elements are removed from the statistical pooling operation). @aheba, can we do it in your ongoing PR?
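
A sketch of what length-masked statistical pooling looks like (function name and shapes are illustrative, not SpeechBrain's actual implementation, which takes relative lengths in [0, 1] as in the recipe):

```python
import torch

def masked_stat_pooling(feats, lengths):
    """feats: (batch, time, dim); lengths: relative lengths in [0, 1]."""
    batch, time, _ = feats.shape
    # Number of real (non-padded) frames per utterance
    n_frames = (lengths * time).round().long()
    # Boolean mask marking real frames: (batch, time, 1)
    mask = (torch.arange(time).unsqueeze(0) < n_frames.unsqueeze(1))
    mask = mask.unsqueeze(-1).float()
    total = mask.sum(dim=1)  # real frame count per utterance
    # Mean and std computed only over unmasked frames
    mean = (feats * mask).sum(dim=1) / total
    var = ((feats - mean.unsqueeze(1)) ** 2 * mask).sum(dim=1) / total
    return torch.cat([mean, var.sqrt()], dim=1)  # (batch, 2 * dim)
```

With the mask applied, pooling an utterance alone or inside a padded batch yields the same statistics, which is why passing lengths removes the batch-size dependence.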
