
Add support for large audio files (> 2 GB)


Until this is built, some users can get away with breaking the audio into chunks, as in my comment on #124.
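For readers who just need the workaround, here is a minimal sketch of chunked reading. It is not the code from the comment on #124; it assumes the audio has already been converted to an uncompressed WAV file that the standard-library `wave` module can open, and `iter_wav_chunks` is a hypothetical helper name.

```python
import wave


def iter_wav_chunks(path, chunk_ms=10_000):
    """Yield the raw PCM bytes of a WAV file, roughly chunk_ms at a time.

    Hypothetical helper: only one chunk is held in memory at once.
    """
    with wave.open(path, "rb") as wav:
        # Number of frames that make up chunk_ms milliseconds of audio.
        frames_per_chunk = max(1, wav.getframerate() * chunk_ms // 1000)
        while True:
            frames = wav.readframes(frames_per_chunk)
            if not frames:  # empty bytes means end of file
                break
            yield frames
```

Each chunk could then be wrapped as `AudioSegment(data=chunk, sample_width=..., frame_rate=..., channels=...)` using the parameters read from the WAV header, processed, and written out before the next chunk is read.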

Try implementing a StreamingAudioSegment class: an API-compatible implementation of AudioSegment (which may use AudioSegment internally?) that provides the same interface and methods but does not load the complete audio into RAM. If there are significant roadblocks there, perhaps fall back to utilities that each perform one memory-intensive operation (not as nice a solution).

Outline of an approach:

The new VeryMemoryConsciousAudioSegment (still workshopping names) could do the audio conversions up front (as is done now), writing standard wave data to temp files on disk. All operations on these instances would just pile up in a list until the moment the actual audio data is needed (for an export, or to retrieve info like duration or loudness).

When the audio data is needed, all pending operations would be applied and the result stored in a new temp file on disk in order to avoid reapplying the operations over and over.
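The outline above could look roughly like the sketch below. This is an illustration under stated assumptions, not a proposed pydub API: `LazyWavSegment` and its methods are hypothetical names, the input is assumed to already be a WAV file on disk, and only operations that can work on one chunk of raw bytes at a time are supported.

```python
import os
import shutil
import tempfile
import wave


class LazyWavSegment:
    """Hypothetical sketch: queue operations and apply them only when needed."""

    def __init__(self, path, pending=()):
        self._path = path              # WAV file on disk holding the current data
        self._pending = list(pending)  # operations queued but not yet applied

    def apply(self, chunk_fn):
        """Queue chunk_fn (bytes -> bytes); no audio is read at this point."""
        return LazyWavSegment(self._path, self._pending + [chunk_fn])

    def _materialize(self, frames_per_chunk=65536):
        """Apply pending operations chunk by chunk into a new temp file, once."""
        if not self._pending:
            return self._path
        fd, out_path = tempfile.mkstemp(suffix=".wav")
        os.close(fd)
        with wave.open(self._path, "rb") as src, wave.open(out_path, "wb") as dst:
            dst.setparams(src.getparams())
            while True:
                chunk = src.readframes(frames_per_chunk)
                if not chunk:
                    break
                for fn in self._pending:
                    chunk = fn(chunk)
                dst.writeframes(chunk)
        # Remember the result so the operations are not reapplied later.
        self._path, self._pending = out_path, []
        return self._path

    def duration_seconds(self):
        """Needs real audio data, so it triggers materialization."""
        with wave.open(self._materialize(), "rb") as wav:
            return wav.getnframes() / wav.getframerate()

    def export(self, out_path):
        """Write the fully processed audio to out_path."""
        shutil.copyfile(self._materialize(), out_path)
```

The sketch leaves out temp-file cleanup and any operation that needs more than one chunk of context (slicing, crossfades, reversing), which is where most of the real design work would be.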

As I think more about this, it seems to have some downsides: it is much more disk intensive, and operations that inspect the audio data (like getting loudness) are harder. I’m becoming more convinced that the current in-memory AudioSegment will need to stick around for some uses even if we get to a feature-complete streaming/on-disk implementation.
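Some inspecting operations, at least aggregate ones such as RMS loudness, may still be computable in a single pass over the file with a running accumulator. The sketch below assumes 16-bit PCM WAV data on disk; `streaming_rms` is a hypothetical helper and is not claimed to match pydub's own `rms` value exactly.

```python
import array
import math
import sys
import wave


def streaming_rms(path, frames_per_chunk=65536):
    """Return the RMS of a 16-bit PCM WAV file, reading it chunk by chunk."""
    total_sq = 0  # running sum of squared sample values
    count = 0     # number of samples seen so far
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit samples")
        while True:
            chunk = wav.readframes(frames_per_chunk)
            if not chunk:
                break
            samples = array.array("h", chunk)
            if sys.byteorder == "big":
                samples.byteswap()  # WAV sample data is little-endian
            total_sq += sum(s * s for s in samples)
            count += len(samples)
    return math.sqrt(total_sq / count) if count else 0.0
```

Operations that need random access or several passes over the data would not fit this pattern, which supports keeping the in-memory class around.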

Note: I was originally going to commandeer #124, then #51, and finally settled on adding a new ticket.

Also related: #101

Issue Analytics

State:
Created 7 years ago
Reactions: 12
Comments: 14 (2 by maintainers)

Top GitHub Comments

2 reactions

exit99 commented, May 3, 2018

Any news on this issue?

0 reactions

jzohrab commented, May 6, 2022

I had a similar use case to @dyyd above, and will be using ffmpeg-python for some preprocessing. Perhaps a reference list of “recipes” for people would be helpful; e.g. I couldn’t figure out how to pipe binary output from ffmpeg-python into a constructor, so I ended up using a temp file.

Code in case it helps anyone:

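import ffmpeg  # the ffmpeg-python package
import pydub
from tempfile import NamedTemporaryFile

def audiosegment_from_mp3_time_range(path_to_mp3, starttime_s, duration_s):
    """Load only the given time range of an MP3 file as an AudioSegment."""
    # The temp file is deleted when the with-block exits, so load it before leaving.
    # Reopening an open NamedTemporaryFile by name works on Unix but not on Windows.
    with NamedTemporaryFile("w+b", suffix=".mp3") as f:
        ffmpeg_cmd = (
            ffmpeg
            # ss/t: seek to the start time and stop after the duration
            .input(path_to_mp3, ss=starttime_s, t=duration_s)
            # acodec='copy': copy the MP3 stream without re-encoding
            .output(f.name, acodec='copy')
            .overwrite_output()
        )
        ffmpeg_cmd.run()
        # print(f'wrote {f.name}')
        # Only the trimmed range is decoded into memory here.
        return pydub.AudioSegment.from_mp3(f.name)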