Audio mixing function

I was wondering if it would be useful to have a function that mixes two audio signals with a given ratio (e.g. mixing two signals at 0.5/0.5 or 0.2/0.8). As far as I know, mixing audio usually means adding the signals sample by sample and clipping the result. Alternatively, the sum could be normalized by the number of signals being mixed.
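As a minimal sketch of the two basic approaches described above (sum then clip, or sum then divide by the number of signals), assuming equal-length floating-point signals in the range [-1, 1]. The function names `mix_sum_clip` and `mix_average` are hypothetical and are not part of librosa:

```python
import numpy as np


def mix_sum_clip(signals):
    """Add equal-length signals sample by sample, then clip to [-1, 1]."""
    return np.clip(np.sum(signals, axis=0), -1.0, 1.0)


def mix_average(signals):
    """Add equal-length signals, then divide by the number of signals."""
    return np.mean(signals, axis=0)


# Two short toy signals (illustrative values, not real audio).
a = np.array([0.5, -0.9, 0.2])
b = np.array([0.7, -0.4, 0.1])

clipped = mix_sum_clip([a, b])   # samples outside [-1, 1] are clamped
averaged = mix_average([a, b])   # never exceeds the loudest input sample
```

Clipping preserves the level of each input but distorts peaks that overflow, while averaging avoids overflow at the cost of lowering the overall level as more signals are added.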

I have been naively mixing audio with a ratio r using mixed_audio = r * audio_A + (1 - r) * audio_B, but the result usually doesn’t match my expectation, as you can easily imagine. Some time ago I came across this paper and code, which use A-weighting to take human perception into account and improve on the naive method for a more realistic auditory experience, and I still refer to them today.
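The naive ratio mix can be sketched as below. The optional `match_rms` step is only an assumption added for illustration: it equalizes the RMS level of the two signals before mixing, which is a much cruder stand-in for the A-weighted, perception-based method of the paper mentioned above and should not be read as that method. The function name `mix_ratio` and its parameters are hypothetical:

```python
import numpy as np


def mix_ratio(audio_a, audio_b, r, match_rms=False, eps=1e-12):
    """Return r * audio_a + (1 - r) * audio_b for equal-length signals.

    If match_rms is True, audio_b is first rescaled to the RMS level of
    audio_a (a simple level match, NOT A-weighting).
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must lie in [0, 1]")
    audio_a = np.asarray(audio_a, dtype=float)
    audio_b = np.asarray(audio_b, dtype=float)
    if audio_a.shape != audio_b.shape:
        raise ValueError("signals must have the same shape")
    if match_rms:
        rms_a = np.sqrt(np.mean(audio_a ** 2))
        rms_b = np.sqrt(np.mean(audio_b ** 2))
        # eps guards against division by zero for a silent audio_b.
        audio_b = audio_b * (rms_a / max(rms_b, eps))
    return r * audio_a + (1.0 - r) * audio_b


# Toy example: a loud and a quiet signal mixed at 0.5/0.5.
loud = np.array([0.5, -0.5, 0.5, -0.5])
quiet = np.array([0.1, -0.1, 0.1, -0.1])
naive = mix_ratio(loud, quiet, 0.5)
leveled = mix_ratio(loud, quiet, 0.5, match_rms=True)
```

Without level matching, a nominal 0.5/0.5 mix is dominated by whichever input is louder, which is one reason the naive result can differ from what the ratio suggests.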

A mixing function that supports both the basic and the A-weighted method would be helpful in various situations, such as data augmentation and dataset creation. Any thoughts on implementing a simple mixing function in librosa? 😃 Thanks!

Issue Analytics

Created 5 years ago
Comments: 8 (6 by maintainers)

Top GitHub Comments

justinsalamon commented, Nov 30, 2018

@kyungyunlee great, glad to hear it, and cheers @lostanlen for the shout out 😃

lostanlen commented, Nov 26, 2018

Hi @kyungyunlee!

At NYU MARL we already maintain a library for musical data augmentation named muda (MUsical Data Augmentation). If you haven’t already done so, please see the ISMIR 2015 paper by @bmcfee and Bello: http://bmcfee.github.io/papers/ismir2015_augmentation.pdf It features BackgroundNoise, an object-oriented interface for mixing two signals with random weights, one drawn from a pool of “foreground” sounds and the other from a pool of “background” sounds: https://muda.readthedocs.io/en/latest/deformers.html#muda.deformers.BackgroundNoise
If you have more than one foreground sound to mix over the background, I would recommend @justinsalamon’s soundscape generator library, scaper: https://github.com/justinsalamon/scaper See his paper from WASPAA 2017.
If you want deterministic mixing, this can be achieved with a Combiner from @rabitt’s pysox: https://pysox.readthedocs.io/en/latest/api.html#module-sox.combine See her paper from ISMIR-LBD 2016.

I am personally a user of (and occasional contributor to) each of these three libraries, so unless there is a compelling use case that none of them covers, I would prefer not to port this to librosa.

Does that make sense?