Slow loading of MonoCut created from multi-channel Recording

Consider the situation where I have an 8-channel wav file (e.g., in AliMeeting). I create a Recording object from this file, and create several MonoCut objects for different supervisions on channel 0 of the recording (similar to “SDM” setting in AMI). Now, if I need to load audio for the cuts (for instance, to extract features), it is much slower than if the audio was originally single-channel.

For example, on the AliMeeting train set, computing features using compute_and_store_features_batch() takes approx. 2h for IHM data (which is single-channel recordings) vs. 14h for SDM data (8-channel recordings).

This issue is already noted in the comment here, but I just wanted to raise this explicitly to invite ideas about whether something could be done for selectively reading channels from the AudioSource.
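One likely reason for the slowdown is that PCM WAV files store samples interleaved by frame, so reading channel 0 of an 8-channel file still means reading the bytes of all 8 channels and discarding most of them. The sketch below illustrates this with Python's standard `wave` module only; it is not how lhotse reads audio, and `make_wav` and `read_channel` are hypothetical helper names used for illustration.

```python
import io
import wave
from array import array


def make_wav(num_channels: int, num_frames: int, sample_rate: int = 16000) -> bytes:
    """Build an in-memory 16-bit PCM WAV; the sample at (frame f, channel c) is f * 10 + c."""
    samples = array(
        "h", (f * 10 + c for f in range(num_frames) for c in range(num_channels))
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(num_channels)
        writer.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        writer.setframerate(sample_rate)
        writer.writeframes(samples.tobytes())
    return buf.getvalue()


def read_channel(wav_bytes: bytes, channel: int, start_frame: int, num_frames: int):
    """Return (samples of one channel, number of raw bytes that had to be read)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as reader:
        num_channels = reader.getnchannels()
        assert reader.getsampwidth() == 2, "this sketch only handles 16-bit PCM"
        reader.setpos(start_frame)  # seek to the start of the segment
        raw = reader.readframes(num_frames)  # always returns ALL channels, interleaved
    interleaved = array("h")
    interleaved.frombytes(raw)
    # Keep every num_channels-th sample, starting at the requested channel.
    return interleaved[channel::num_channels], len(raw)


if __name__ == "__main__":
    for n_ch in (1, 8):
        samples, n_bytes = read_channel(make_wav(n_ch, 100), 0, 10, 5)
        print(f"{n_ch} channel(s): read {n_bytes} bytes for {len(samples)} samples")
```

Seeking to the segment start is cheap, but the bytes read for the segment grow linearly with the channel count, so any selective channel reading would have to work around the interleaved layout rather than skip data on disk.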

Issue Analytics

  • State: open
  • Created 9 months ago
  • Comments: 5

Top GitHub Comments

pzelasko commented, Dec 7, 2022

Regarding the processing time of compute_and_store_features_batch, I think it is mainly because of disk IO. So we could create a special kind of SimpleCutSampler that prefetches audio into memory using a dedicated thread inside that class, e.g. SimplePrefetchedCutSample…

In these cases, just use a dataloader with num_workers > 0 and an unsupervised waveform dataset.
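The prefetching idea, which a dataloader with worker processes already provides, can be sketched in plain Python as an iterator backed by a background thread. This is a generic illustration using only the standard library, not lhotse or PyTorch code; `prefetch` and `load_fn` are hypothetical names.

```python
import queue
import threading
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
R = TypeVar("R")

_DONE = object()  # sentinel marking the end of the stream


def prefetch(
    items: Iterable[T], load_fn: Callable[[T], R], buffer_size: int = 4
) -> Iterator[R]:
    """Yield load_fn(item) for each item, loading ahead in a background thread."""
    buffer: queue.Queue = queue.Queue(maxsize=buffer_size)

    def worker() -> None:
        try:
            for item in items:
                # Blocks when the buffer is full, which bounds memory use.
                buffer.put((True, load_fn(item)))
        except Exception as exc:
            buffer.put((False, exc))  # hand the error over to the consumer
        else:
            buffer.put(_DONE)

    # Daemon thread: it will not keep the process alive if the consumer stops early.
    threading.Thread(target=worker, daemon=True).start()

    while True:
        message = buffer.get()
        if message is _DONE:
            return
        ok, payload = message
        if not ok:
            raise payload
        yield payload


if __name__ == "__main__":
    # Stand-in for "load audio for a cut": here it just doubles a number.
    for loaded in prefetch(range(5), lambda x: x * 2):
        print(loaded)
```

A single thread like this only hides loading latency behind the consumer's work; worker processes in a dataloader additionally run the loading in parallel.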

desh2608 commented, Dec 7, 2022

I don’t think disk IO is a bottleneck here. I have found the method quite fast when computing features for data which contain 1 utterance per recording (like LibriSpeech).
