Slow loading of MonoCut created from multi-channel Recording
Consider the situation where I have an 8-channel wav file (e.g., in AliMeeting). I create a Recording object from this file, and create several MonoCut objects for different supervisions on channel 0 of the recording (similar to the "SDM" setting in AMI). Now, if I need to load audio for the cuts (for instance, to extract features), it is much slower than if the audio were originally single-channel.
For example, on the AliMeeting train set, computing features using compute_and_store_features_batch() takes approx. 2h for IHM data (single-channel recordings) vs. 14h for SDM data (8-channel recordings).
This issue is already noted in the comment here, but I wanted to raise it explicitly to invite ideas about whether something could be done to selectively read channels from the AudioSource.
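To make the cost concrete, here is a minimal sketch (plain NumPy, not lhotse code) of why selecting one channel from an interleaved WAV does not save I/O: the bytes for all channels are read and decoded regardless, and the unwanted channels are discarded only afterwards.

```python
import numpy as np

# Illustrative only: WAV files store samples interleaved, so frame i of an
# 8-channel file holds 8 consecutive samples (one per channel). Reading
# "just channel 0" still means reading and decoding every channel, then
# throwing away 7/8 of the data -- which is why MonoCut loading is slow.

num_frames, num_channels = 1000, 8
# Simulated interleaved buffer as it would come off disk: c0 c1 ... c7 c0 c1 ...
interleaved = np.arange(num_frames * num_channels, dtype=np.int32)

# De-interleave: reshape to (frames, channels), then select channel 0.
frames = interleaved.reshape(num_frames, num_channels)
channel0 = frames[:, 0]

# Only 1/8 of the decoded samples are kept.
assert channel0.shape == (num_frames,)
assert channel0.size * num_channels == interleaved.size
```

The reshape-and-slice itself is cheap; the expense the issue describes comes from repeatedly reading and decoding the full 8-channel file once per cut.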
Issue Analytics
- Created: 9 months ago
- Comments: 5
Top GitHub Comments
In these cases, just use a DataLoader with num_workers > 0 and an unsupervised waveform dataset.
I don't think disk I/O is the bottleneck here. I have found the method quite fast when computing features for data that contain 1 utterance per recording (like LibriSpeech).
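The first suggestion above (a DataLoader with num_workers > 0) amortizes the per-cut read-and-decode cost across workers. A minimal stdlib sketch of the same idea using concurrent.futures instead of PyTorch; the load_cut_audio helper is hypothetical and stands in for cut.load_audio():

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for loading one cut's audio; in the real pipeline
# this would call cut.load_audio(), which decodes all channels and keeps one.
def load_cut_audio(cut_id: str) -> str:
    return f"audio-for-{cut_id}"

cut_ids = [f"cut-{i}" for i in range(16)]

# Overlap the (I/O + decode) work across workers, analogous to a torch
# DataLoader with num_workers > 0 over an unsupervised waveform dataset.
with ThreadPoolExecutor(max_workers=4) as pool:
    audios = list(pool.map(load_cut_audio, cut_ids))

assert len(audios) == len(cut_ids)
```

Parallelism hides latency but does not reduce the total bytes decoded per cut, so it mitigates rather than fixes the 8x overhead described in the issue.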