🚀 Feature Request: Loading audio data from BytesIO or memory
🚀 Feature
The load API does not support loading audio bytes from memory. It would be a great addition to be able to load from a file-like object, e.g. BytesIO. This would be similar to SoundFile’s read function (https://github.com/bastibe/SoundFile/blob/master/soundfile.py#L170)
Motivation
This addition would support the use case of reading audio as blobs directly from a DB instead of writing the files locally first.
Pitch
Without this feature, torchaudio.load is not useful for users who load files from a DB and would love to use torchaudio for all audio operations.
Alternatives
SoundFile supports loading from bytes but currently does not support MP3 files. CommonVoice’s audio files are saved as MP3, so they have to be converted to FLAC or WAV before training.
waveform, samplerate = sf.read(file=io.BytesIO(audio_bytes), dtype='float32')
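The pattern requested here (wrap raw bytes in a file-like object and hand it to a decoder) can be illustrated with the standard library alone. The following is only a minimal sketch, not torchaudio's or SoundFile's API: it assumes 16-bit mono WAV data, uses an in-memory SQLite table as a stand-in for the DB, and the names make_wav_bytes, load_wav_from_bytes, and the clips table are hypothetical.

```python
import io
import sqlite3
import struct
import wave


def make_wav_bytes(samples, sample_rate=16000):
    """Encode 16-bit mono PCM samples as a WAV file held entirely in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        writer.setframerate(sample_rate)
        writer.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buffer.getvalue()


def load_wav_from_bytes(audio_bytes):
    """Decode WAV bytes through a file-like object; returns (samples, sample_rate)."""
    with wave.open(io.BytesIO(audio_bytes), "rb") as reader:
        if reader.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        count = reader.getnframes() * reader.getnchannels()
        frames = reader.readframes(reader.getnframes())
        samples = list(struct.unpack(f"<{count}h", frames))
        return samples, reader.getframerate()


def fetch_clip(connection, clip_id):
    """Read one audio blob from the (hypothetical) clips table and decode it."""
    row = connection.execute(
        "SELECT audio FROM clips WHERE id = ?", (clip_id,)
    ).fetchone()
    return load_wav_from_bytes(row[0])


if __name__ == "__main__":
    # Stand-in database: nothing is ever written to the local filesystem.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE clips (id INTEGER PRIMARY KEY, audio BLOB)")
    db.execute(
        "INSERT INTO clips (audio) VALUES (?)",
        (make_wav_bytes([0, 1000, -1000, 32767]),),
    )
    samples, sample_rate = fetch_clip(db, 1)
    print(len(samples), sample_rate)
```

The wave module only understands uncompressed WAV, so this does not address the MP3 case raised above; it only shows the shape of a bytes-in, samples-out load call.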
Issue Analytics
- State:
- Created 3 years ago
- Reactions:15
- Comments:16 (7 by maintainers)
Top GitHub Comments
@mthrok and others.
I found a workaround for the memory leak that I described in a comment on the bug I reported: https://github.com/irmen/pyminiaudio/issues/19#issuecomment-663178015. The solution still uses miniaudio’s functionality but calls a different function. The memory leak appears in pyminiaudio’s implementation of the decode* functions, which do not release memory.
For those wishing to use pyminiaudio’s in-memory MP3 decoder, here is working code which I will be using in my Common Voice training. Note: I have reimplemented the mp3_read_f32 function because of the https://github.com/irmen/pyminiaudio/issues/18 bug, and it currently does not report sample_rate back to the caller.
@mthrok, thank you very much for your detailed, quick response. This is very helpful.
I agree with you regarding the challenges and limitations of currently used back-ends.
After you mentioned the miniaudio library, I checked it out and can confirm it perfectly satisfies my use case. Not only can I load MP3 data from memory, but I can also down-sample (from 44100 to 16000) on the fly. Also, the library seems native and does not spawn a separate process like pydub.AudioSegment. Another bonus is that there is no OS dependency, like ffmpeg. miniaudio uses a C lib (https://miniaud.io/). I definitely recommend looking into this as a new backend.
For those who wish to see working code, here it is:
MP3 file: common_voice_en_20603299.zip
Plot output: