I’m trying to follow the recipe for speaker diarization on the AMI dataset (https://github.com/speechbrain/speechbrain/tree/develop/recipes/AMI/Diarization) but unfortunately without success. Here’s the output:

...
speechbrain.utils.parameter_transfer - Loading pretrained files for: embedding_model, mean_var_norm_emb
__main__ - Tuning for p-value for SC (Multiple iterations over AMI Dev set)
__main__ - Diarizing dev set
__main__ - No recording IDs found! Please check if meta_data json file is properly generated.

I have downloaded the data and set the variables in the config files accordingly, i.e.:

data_folder: .../AMI/amicorpus/
manual_annot_folder: .../AMI/ami_public_manual_1.6.2 

where amicorpus looks as follows: amicorpus/EN2009d/audio/EN2009d.Mix-Headset.wav
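A quick way to confirm that the layout described above is what actually sits on disk is to list the recordings that match it. This is only a sketch: `list_recordings` and its `mic_type` parameter are hypothetical names used for illustration and are not part of the SpeechBrain recipe, and the glob pattern simply mirrors the `<ID>/audio/<ID>.Mix-Headset.wav` layout quoted in this post.

```python
from pathlib import Path


def list_recordings(data_folder, mic_type="Mix-Headset"):
    """Map meeting ID -> wav path for files laid out as <ID>/audio/<ID>.<mic_type>.wav.

    Hypothetical helper: it only checks the directory layout and does not
    call into SpeechBrain.
    """
    root = Path(data_folder)
    found = {}
    for wav in sorted(root.glob(f"*/audio/*.{mic_type}.wav")):
        meeting_id = wav.parent.parent.name
        # Keep only files whose name matches the enclosing meeting folder.
        if wav.name == f"{meeting_id}.{mic_type}.wav":
            found[meeting_id] = wav
    return found
```

Calling `list_recordings(data_folder)` with the same value as in the config and printing the sorted keys shows which meeting IDs are visible under that path. If meetings are missing from that list, the data preparation step has no audio to build entries from for them.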

I’m running this using device: 'cpu'

I checked the results/…/metadata folder and found that ami_dev.Mix-Headset.subsegs.json and eval.Mix-Headset.subsegs.json are empty, while ami_train.Mix-Headset.subsegs.json contains a dict of elements like

"EN2009d_0.0_2.99": {
    "wav": {
      "file": "/Users/jonas/Desktop/Translated/ASR/datasets/AMI/amicorpus//EN2009d/audio/EN2009d.Mix-Headset.wav",
      "duration": 2.99,
      "start": 0,
      "stop": 47840
    }
  }

I would really appreciate some help! Am I missing anything?

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 14

Top GitHub Comments

1 reaction
nauman-daw commented, Oct 25, 2021

Hi,

Yes. The message “No recording IDs found! Please check if meta_data json file is properly generated.” is most likely caused by incorrect paths. Please check “<filename>.subsegs.json”, as this file will be used by your experiment.py.
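As a rough illustration of that check, the sketch below counts the entries in each `*.subsegs.json` file in a metadata folder and lists any `wav` paths that do not exist on disk. The function names `summarize_subsegs` and `check_metadata_folder` are hypothetical and not part of the recipe. The code assumes each entry has the `"wav": {"file": ...}` shape shown in the question, and it treats a zero-byte file the same as an empty dict.

```python
import json
from pathlib import Path


def summarize_subsegs(json_path):
    """Return (number of entries, sorted list of wav paths missing on disk).

    Hypothetical helper; assumes entries look like
    {"<seg_id>": {"wav": {"file": "<path>", ...}}}.
    """
    text = Path(json_path).read_text().strip()
    # A zero-byte file is treated like an empty dict.
    entries = json.loads(text) if text else {}
    missing = sorted(
        {
            seg["wav"]["file"]
            for seg in entries.values()
            if not Path(seg["wav"]["file"]).is_file()
        }
    )
    return len(entries), missing


def check_metadata_folder(meta_dir):
    """Summarize every *.subsegs.json file found directly in meta_dir."""
    return {
        json_file.name: summarize_subsegs(json_file)
        for json_file in sorted(Path(meta_dir).glob("*.subsegs.json"))
    }
```

Running `check_metadata_folder` on the results/…/metadata folder and printing the result gives one `(entry_count, missing_paths)` pair per file. A count of 0 matches the empty dev and eval files described in the question, and a non-empty list of missing paths points at a wrong `data_folder`.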

@LONG520520 please feel free to open a PR for this, I will check it. Even if the PR suggests some useful points on how to avoid these path errors, it will be very helpful for others.

thank you very much!

0 reactions
nauman-daw commented, Jan 17, 2022

Closing this path issue for now.
