Constructing multi-channel audio buffers

Description

In testing for #960, we discovered that waveplot tests are passing when they should fail. The cause is subtle: waveplot makes stereo envelopes by framing each channel separately, which means that a stereo signal needs to be C-contiguous for this to work. This was implicit behavior before, and #960 ensures that multi-channel audio is F-contiguous. However, the test still passes because the fixture for stereo waveplot constructs stereo signals by vstacking two mono signals, which ends up producing C-contiguous data.
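To make the contiguity point concrete, here is a minimal NumPy-only sketch. The signal names and lengths are made up for illustration and are not taken from the librosa test fixture.

```python
import numpy as np

# Two hypothetical mono signals of equal length (illustrative values only)
left = np.zeros(1000, dtype=np.float32)
right = np.ones(1000, dtype=np.float32)

# vstack builds a (2, 1000) array in C (row-major) order:
# each channel occupies its own contiguous block of memory
stereo = np.vstack([left, right])
print(stereo.shape)
print(stereo.flags["C_CONTIGUOUS"], stereo.flags["F_CONTIGUOUS"])

# Converting to F (column-major) order interleaves the channels in memory
stereo_f = np.asfortranarray(stereo)
print(stereo_f.flags["C_CONTIGUOUS"], stereo_f.flags["F_CONTIGUOUS"])
```

Both arrays hold the same values and have the same shape; only the memory layout differs, which is why a test that depends on layout can pass by accident.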

The point of this story is that vstacking mono signals to get stereo sort of works, but it’s not the “right” way to do it: the memory layout (C vs. F ordering) comes out wrong. There isn’t an obviously easy way to do this properly, so we should make one.

We already have to_mono for downmixing stereo (or general multichannel) to mono by averaging. It stands to reason that we should have an easy way to go the opposite direction: combine multiple input signals to a single multi-channel array with proper ordering.

The obvious name here is to_stereo, but that’s not quite right if we want multichannel. to_multichannel is a bit of a mouthful, but could work. Anyone have other suggestions? Maybe upmix, and we can have an alias of downmix = to_mono? I worry about people expecting fancier downmixing behaviors though (e.g., 5.1 -> stereo).

Issue Analytics

Created: 4 years ago
Comments: 5 (5 by maintainers)

Top GitHub Comments

bmcfee commented, Aug 16, 2019

It occurs to me that one might want to apply this operation (F-ordered concatenation) on things other than waves: for example, harmonic interpolation should produce an F-ordered array of shape (n_harmonics, frame_length, n_frames). Likewise for multi-band tempograms, etc. In general, I can imagine wanting to do this on arbitrary features of compatible dimension, and expecting the output to be coherently aligned.

Maybe we should just call it librosa.util.stack? This would basically be np.stack, except the output array would be pre-allocated with F-contiguity if axis=0 (new dimension is first) and C-contiguity if axis=-1 (new dimension is last). The resulting stack would be compatible with nd-framing (#944) in both cases.
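A rough sketch of what such a helper could look like, using only NumPy. The name stack_sketch and its error handling are hypothetical and are not claimed to match whatever librosa ends up shipping; the only behavior taken from the comment above is the pre-allocation rule (F-order for axis=0, C-order for axis=-1).

```python
import numpy as np


def stack_sketch(arrays, axis=0):
    """Hypothetical helper: np.stack with a pre-allocated output layout.

    axis=0  -> new dimension first, output is F-contiguous
    axis=-1 -> new dimension last, output is C-contiguous
    """
    arrays = [np.asarray(a) for a in arrays]
    if not arrays:
        raise ValueError("need at least one array to stack")
    if len({a.shape for a in arrays}) != 1:
        raise ValueError("all input arrays must have the same shape")
    if axis not in (0, -1):
        raise ValueError("this sketch only handles axis=0 or axis=-1")

    in_shape = arrays[0].shape
    dtype = np.result_type(*arrays)

    # Pre-allocate the output with the desired memory order
    if axis == 0:
        out = np.empty((len(arrays),) + in_shape, dtype=dtype, order="F")
    else:
        out = np.empty(in_shape + (len(arrays),), dtype=dtype, order="C")

    # np.stack writes into the pre-allocated buffer, keeping its layout
    return np.stack(arrays, axis=axis, out=out)


# Usage: build a stereo signal from two made-up mono signals
y = stack_sketch([np.zeros(8), np.ones(8)], axis=0)
print(y.shape, y.flags["F_CONTIGUOUS"])
```

Because the layout is fixed at allocation time, the same function works for inputs of any shared shape, which covers the feature-stacking cases mentioned above as well as plain waveforms.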

bmcfee commented, Aug 16, 2019

Ok, stack it is! I think it probably makes sense to roll this in with the PR for #944, since it’s a small function on its own, but it will be necessary to fix the things that #944 breaks (e.g., waveplot tests).
