
[Bug] Spleeter adding small padding to output audio files


Description

While trying to reduce the memory footprint by splitting input files into 30-second chunks (discussed on this thread), we noticed that Spleeter adds a tiny amount of padding after each output stem file, which creates a small gap when the 30-second chunks are stitched back into a single stem. Sometimes the gap is unnoticeable, but when a song is processed and mixed back together, the hiccup is easy to spot. The waveform also shows clearly that Spleeter adds a gap:

[Image: waveform showing the gap]

To confirm that Spleeter is the cause, I split and stitched other files that were not processed by Spleeter, and the stitching was flawless. Throughout the experiment I used only lossless (WAV) files to avoid the padding that some lossy formats introduce.

Here is the file that generated the waveform above; listening carefully, you can hear a hiccup (gap) every 30 seconds.

Steps to reproduce

1 - Take a WAV file longer than 30 seconds and split it into 30s chunks using FFmpeg or Sox. Rename your file to myfile.wav to reuse the commands below:

FFmpeg: ffmpeg -i myfile.wav -f segment -segment_time 30 -c copy myfile-%03d.wav
Sox: sox myfile.wav myfile-.wav trim 0 30 : newfile : restart

2 - Process all the chunks using Spleeter:

spleeter separate -i myfile-* -p spleeter:2stems -B tensorflow -o out

3 - Move the first two accompaniment stems into the same directory for stitching:

mv ./out/myfile-002/accompaniment.wav ./out/myfile-001/accompaniment2.wav
cd ./out/myfile-001

4 - Stitch accompaniment.wav and accompaniment2.wav using Sox or FFmpeg:

FFmpeg: ffmpeg -f concat -safe 0 -i <(for f in ./accompaniment*.wav; do echo "file '$PWD/$f'"; done) -c copy output.wav
Sox: sox accompaniment.wav accompaniment2.wav output.wav

5 - Listen to output.wav and notice the hiccup during the transition at ~30s.
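Besides listening, one rough way to check whether a stem changed length is to compare frame counts. The sketch below uses only Python's standard `wave` module and assumes both files are plain PCM WAV; the helper names are hypothetical and not part of Spleeter. It only detects a change in length, so a shift inside the signal would go unnoticed.

```python
import wave


def frame_count(path):
    """Return the number of audio frames stored in a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes()


def length_difference(chunk_path, stem_path):
    """Frames in the separated stem minus frames in the chunk it came from.

    A result of 0 means the stem has the same length as its input chunk;
    a positive result means frames were added somewhere.
    """
    return frame_count(stem_path) - frame_count(chunk_path)
```

For example, `length_difference("myfile-001.wav", "out/myfile-001/accompaniment.wav")` would compare one input chunk with its accompaniment stem, assuming the file layout from the steps above.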

You can also use this shell script by @amo13

Environment

OS: Linux using Docker
Installation type: Conda
RAM available: 6GB
Hardware spec: Docker using 8 CPUs

Additional context

Stitching discussion

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Reactions: 6
  • Comments: 6 (1 by maintainers)

Top GitHub Comments

3 reactions
romi1502 commented, Jul 1, 2020

Hi @geraldoramos, thank you for the detailed issue. Yes, there is indeed an issue at the beginning of the reconstructed signals. It is due to a strange behavior of the TensorFlow STFT that Spleeter does not compensate for: the first window of the STFT starts at the first sample, whereas it should be centered on the first sample. To compensate, we should pad the beginning of the STFT input with zeros and remove the padded portion after separation. This is usually not a big deal if you process a full track (songs commonly already have a bit of silence at the beginning). I’ll look into a quick fix for this aspect of the issue.
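The pad-then-trim idea can be sketched in a few lines of plain Python. This is only an illustration, not Spleeter's actual fix: `separate` is a hypothetical stand-in for whatever processes the samples, and the right value of `pad` (presumably tied to the STFT window length) is an assumption left to the caller.

```python
def pad_front(samples, pad):
    """Prepend `pad` zero samples so the first real sample is no longer at the edge."""
    return [0.0] * pad + list(samples)


def trim_front(samples, pad):
    """Drop the first `pad` samples, undoing pad_front after processing."""
    return list(samples[pad:])


def separate_with_front_padding(samples, separate, pad):
    """Run `separate` (hypothetical callable) on a zero-padded copy of the input.

    Any damage `separate` does to the first `pad` samples lands on the
    padding, which is removed again before returning.
    """
    padded = pad_front(samples, pad)
    return trim_front(separate(padded), pad)
```

With a toy `separate` that corrupts the first few samples it receives, the padded call returns the original samples intact, whereas the unpadded call does not.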

That being said, even after this first problem is solved, you’ll still have trouble at the borders if you process chunked segments of audio without overlap: this is inherent to STFT processing with overlap-and-add reconstruction. The result of the first chunk can actually leak a bit into the next chunk, and if you don’t take this into account you may still get glitches. So if you want no glitches, you need a bit of overlap between your chunks (which, by the way, will solve the first problem too).
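One simple way to use overlap is to read each chunk with some extra context on both sides, process it, and keep only the central part. The sketch below is a minimal illustration under stated assumptions: `separate` is a hypothetical callable that returns as many samples as it receives, all sizes are in samples, and the overlap is simply discarded rather than cross-faded. It is not necessarily what Spleeter or the wrapper mentioned in this thread does.

```python
def chunk_bounds(total, chunk, overlap):
    """List (read_start, read_end, core_start, core_end) for each chunk.

    The core ranges tile [0, total) without gaps; the read ranges extend
    each core by up to `overlap` samples on both sides.
    """
    bounds = []
    for core_start in range(0, total, chunk):
        core_end = min(core_start + chunk, total)
        read_start = max(0, core_start - overlap)
        read_end = min(total, core_end + overlap)
        bounds.append((read_start, read_end, core_start, core_end))
    return bounds


def separate_in_chunks(samples, separate, chunk, overlap):
    """Process `samples` chunk by chunk and keep only each chunk's core."""
    out = []
    for read_start, read_end, core_start, core_end in chunk_bounds(
        len(samples), chunk, overlap
    ):
        processed = separate(samples[read_start:read_end])
        offset = core_start - read_start  # skip the leading overlap
        out.extend(processed[offset:offset + (core_end - core_start)])
    return out
```

Border artifacts that fall inside the overlap are thrown away, so interior chunk boundaries stay clean; the very start and end of the whole signal have no extra context and are not protected by this scheme.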

2 reactions
amo13 commented, May 6, 2021

Good call! I thought about it a while back and wasn’t sure if anyone would use this anyway…

Here it is: https://github.com/amo13/spleeter-wrapper
