Cannot get 1.1771 eer (a large gap) with the pretrained model in README

In the README, it mentions that:

A larger model trained with online data augmentation, described in [2], can be downloaded from here.

The following script should return: EER 1.1771.

python ./trainSpeakerNet.py --eval --model ResNetSE34V2 --log_input True --encoder_type ASP --n_mels 64 --trainfunc softmaxproto --save_path exps/test --eval_frames 400  --initial_model baseline_v2_ap.model

However, with the pretrained baseline_v2_ap model and the same script, I can only get EER 1.7073, MinDCF 0.12625. Is there a mistake in your model or code?

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Reactions: 1
  • Comments: 12 (5 by maintainers)

Top GitHub Comments

msh9184 commented, Mar 1, 2022 (6 reactions)

As @chmod740 commented, this looks like a breaking change (BC) that depends on the torch version.

From torch 1.10.0 on, the torch.nn.functional.pairwise_distance function computes the pairwise distance between vectors, unlike the batchwise pairwise distance computed in versions 1.7.0 to 1.9.1.

Using the torch.cdist function instead seems to be compatible, for instance:

    # dist = F.pairwise_distance(ref_feat.unsqueeze(-1), com_feat.unsqueeze(-1).transpose(0,2)).detach().cpu().numpy()
    dist = torch.cdist(ref_feat.reshape((num_eval, -1)), com_feat.reshape((num_eval, -1))).detach().cpu().numpy()

https://github.com/clovaai/voxceleb_trainer/blob/9481143015109554840dcdd4a9bbefce897114a2/SpeakerNet.py#L223
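
As a rough illustration of what the torch.cdist replacement computes, here is a minimal sketch. It assumes the embeddings are 2-D tensors with one row per embedding; the helper name all_pairs_distance, the shapes, and the random inputs are hypothetical and not taken from voxceleb_trainer. The explicit broadcast is included only as a cross-check that does not depend on how pairwise_distance behaves in a given torch version.

```python
import torch


def all_pairs_distance(ref_feat: torch.Tensor, com_feat: torch.Tensor) -> torch.Tensor:
    """Euclidean distance between every row of ref_feat and every row of com_feat.

    Hypothetical helper: ref_feat is (R, D), com_feat is (C, D), result is (R, C).
    """
    return torch.cdist(ref_feat, com_feat)


torch.manual_seed(0)
ref = torch.randn(3, 8)  # 3 made-up reference embeddings of dimension 8
com = torch.randn(5, 8)  # 5 made-up comparison embeddings of dimension 8

dist = all_pairs_distance(ref, com)

# Cross-check: subtract with explicit broadcasting, then take the L2 norm
# over the embedding dimension. (3, 1, 8) - (1, 5, 8) -> (3, 5, 8) -> (3, 5)
expected = (ref.unsqueeze(1) - com.unsqueeze(0)).norm(dim=-1)

print(dist.shape)
print(torch.allclose(dist, expected, atol=1e-5))
```

Because torch.cdist takes the two sets of vectors as separate 2-D inputs, the result does not rely on the broadcasting behavior of pairwise_distance, which is the part reported above as having changed.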


[Image: result using torch.nn.functional.pairwise_distance() in PyTorch 1.10]

[Image: result using torch.cdist() in PyTorch 1.10]

chmod740 commented, Mar 1, 2022 (4 reactions)

I also found the same problem on the master branch, although I successfully reproduced this result a year or two ago. I compared the two versions of the code and found some differences.

First, switch PyTorch to version 1.7.1, which is the version I reproduced the result with. Comparing PyTorch 1.7.1 and 1.10.0, I found that torch.nn.functional.pairwise_distance, which is used to calculate the score, has changed.

[Image: QQ20220301-121829]

Second, try adding the following line of code to the forward method in ResNetSE34V2.py. The previous code read the speech with the wavfile module rather than the soundfile module, and audio read with the wavfile module is not normalized.

    def forward(self, x):
        # Added line: rescale the input to the int16 amplitude range (2**15 = 32768)
        x *= 32768.0
        with torch.no_grad():
            with torch.cuda.amp.autocast(enabled=False):
                x = self.torchfb(x)+1e-6
                if self.log_input: x = x.log()
                x = self.instancenorm(x).unsqueeze(1)
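
To make the effect of that scaling concrete, here is a small sketch, assuming the waveform arrives as floats normalized to [-1, 1). The helper name to_int16_range and the sample values are hypothetical. Unlike the in-place `x *= 32768.0` above, this version returns a new tensor, so the caller's tensor is left untouched.

```python
import torch

INT16_SCALE = 32768.0  # 2**15, the magnitude of the most negative int16 value


def to_int16_range(x: torch.Tensor) -> torch.Tensor:
    """Return a rescaled copy of x, mapping [-1, 1) to the int16 amplitude range.

    Hypothetical helper; it does not modify x in place.
    """
    return x * INT16_SCALE


# Made-up normalized samples with shape (batch, samples)
normalized = torch.tensor([[-1.0, -0.5, 0.0, 0.5]])
rescaled = to_int16_range(normalized)

print(rescaled)
```

Whether the scaling is needed depends on how the audio was loaded when the model was trained, so treat this as a way to test the hypothesis rather than a confirmed fix.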