Constructing multi-channel audio buffers
Description
In testing for #960, we discovered that waveplot tests are passing when they should fail. The cause is subtle: waveplot makes stereo envelopes by framing each channel separately, which only works if the stereo signal is C-contiguous. This requirement was implicit before, and #960 ensures that multi-channel audio is F-contiguous. However, the test still passes because the fixture for stereo waveplot constructs stereo signals by vstacking two mono signals, which produces C-contiguous data.
The point of this story is that vstacking mono signals to get stereo sort of works, but it’s not the “right” way to do it: the memory layout (C vs. F ordering) comes out wrong. There isn’t an obviously easy way to do this properly, so we should make one.
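The layout difference described above can be checked directly in NumPy. This is a minimal sketch with two made-up mono signals (`left` and `right` are illustrative names, not fixtures from the test suite); it only shows that `np.vstack` yields C-contiguous data, while `np.asfortranarray` yields the same values in F order.

```python
import numpy as np

# Two hypothetical mono signals of equal length (values are arbitrary).
left = np.linspace(-1.0, 1.0, num=8)
right = left[::-1].copy()

# vstack gives shape (2, n) in C order: each channel is one contiguous row.
stereo_c = np.vstack([left, right])

# Same values in F order: the samples of one time step, across channels,
# sit next to each other in memory.
stereo_f = np.asfortranarray(stereo_c)

# Inspect the contiguity flags and strides of both layouts.
print(stereo_c.flags["C_CONTIGUOUS"], stereo_c.flags["F_CONTIGUOUS"], stereo_c.strides)
print(stereo_f.flags["C_CONTIGUOUS"], stereo_f.flags["F_CONTIGUOUS"], stereo_f.strides)

# The values are identical; only the memory layout differs.
print(np.array_equal(stereo_c, stereo_f))
```

Because the two arrays compare equal element by element, a test that only checks values cannot tell them apart, which is how a fixture built this way can hide a layout-dependent bug.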
We already have to_mono for downmixing stereo (or general multichannel) to mono by averaging. It stands to reason that we should have an easy way to go the opposite direction: combine multiple input signals into a single multi-channel array with proper ordering.
The obvious name here is to_stereo, but that’s not quite right if we want multichannel. to_multichannel is a bit of a mouthful, but could work. Anyone have other suggestions? Maybe upmix, and we can have an alias of downmix = to_mono? I worry about people expecting fancier downmixing behaviors though (e.g. 5.1 -> stereo).
Issue Analytics
- State:
- Created 4 years ago
- Comments: 5 (5 by maintainers)
Top GitHub Comments
It occurs to me that one might want to apply this operation (F-ordered concatenation) on things other than waves: for example, harmonic interpolation should produce an F-ordered array of shape (n_harmonics, frame_length, n_frames). Likewise for multi-band tempograms, etc. In general, I can imagine wanting to do this on arbitrary features of compatible dimension, and expecting the output to be coherently aligned.
Maybe we should just call it librosa.util.stack? This would basically be np.stack, except the output array would be pre-allocated with F-contiguity if axis=0 (new dimension is first) and C-contiguity if axis=-1 (new dimension is last). The resulting stack would be compatible with nd-framing (#944) in both cases.
OK, stack it is! I think it probably makes sense to roll this in with the PR for #944, since it’s a small function on its own, but it will be necessary to fix the things that #944 breaks (e.g. waveplot tests).
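The behavior proposed in this thread could look roughly like the following. This is a sketch only: `stack_sketch` is a hypothetical name used for illustration, and the function that eventually lands in librosa may differ in signature and details. It assumes all inputs share one shape and handles only the two axis values discussed above.

```python
import numpy as np


def stack_sketch(arrays, axis=0):
    """Hypothetical sketch of the proposal; not the actual librosa code.

    Stacks same-shaped arrays along a new first axis (F-contiguous output)
    or a new last axis (C-contiguous output).
    """
    if axis not in (0, -1):
        raise ValueError("this sketch only handles axis=0 or axis=-1")

    arrays = [np.asarray(a) for a in arrays]
    if not arrays:
        raise ValueError("need at least one input array")

    item_shape = arrays[0].shape
    if any(a.shape != item_shape for a in arrays):
        raise ValueError("all input arrays must have the same shape")

    dtype = np.result_type(*arrays)

    # Pre-allocate the output with the layout that matches the new axis.
    if axis == 0:
        out = np.empty((len(arrays),) + item_shape, dtype=dtype, order="F")
    else:
        out = np.empty(item_shape + (len(arrays),), dtype=dtype, order="C")

    # np.stack fills the pre-allocated buffer and keeps its memory layout.
    np.stack(arrays, axis=axis, out=out)
    return out


# Usage: two made-up mono signals combined into a stereo buffer.
left = np.zeros(4)
right = np.ones(4)
y_first = stack_sketch([left, right], axis=0)   # shape (2, 4), F-contiguous
y_last = stack_sketch([left, right], axis=-1)   # shape (4, 2), C-contiguous
```

In both cases the samples belonging to one time step end up adjacent in memory, which is the property the framing discussion above relies on.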