[Discussion] Spleeter for real-time applications

Hi! This issue is a follow-up to a Gitter discussion. I’m developing a C++ port of spleeter here as a side project. My goal is to let people use the spleeter technology within plug-ins. Lately, I realized that the architecture is built to run in batches of size ‘T’, which is 512 FFT frames in the pre-trained models (~12 seconds). This kind of latency isn’t suitable for real-time processing. To check whether the architecture can meet my needs, I need to evaluate the quality loss from lowering that value. My plan is to train models for 2 and 4 stems with multiple values of T and compare their quality.
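For reference, here is a rough sketch of how the batch duration scales with T. It assumes a hop of 1024 samples between FFT frames and a 44100 Hz sample rate, which is consistent with the ~12 seconds quoted above but is not stated in this thread. The function name is made up for illustration.

```python
def batch_duration_seconds(t_frames, hop_size=1024, sample_rate=44100):
    """Audio duration covered by t_frames hops (hop_size and sample_rate are assumed)."""
    return t_frames * hop_size / sample_rate


# Candidate values of T, from the pre-trained 512 down to smaller batches.
for t in (512, 256, 128, 64, 32):
    print(f"T={t:3d} -> {batch_duration_seconds(t):.2f} s")
```

Under these assumptions, halving T halves the batch duration, so any value that keeps quality acceptable translates directly into lower latency.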

I created a repository to report on that, as I assume it may be of interest to others, and also to get some help in the process.

I already have a couple of questions. First: I am using MUSDB18. I noticed that the training configuration requires a description of the evaluation set. Since I will be using that set to evaluate the trained models, I would like to make sure that training does not use the validation set; otherwise I would be evaluating on data the network has already seen. Is the evaluation_csv parameter used during training?
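Independently of how the parameter is used, a quick sanity check is to confirm that the training and evaluation CSV files list no tracks in common. The sketch below assumes both files have a header row with a column identifying each track; the column name mix_path and the helper names are assumptions for illustration. It does not answer whether training reads the evaluation file.

```python
import csv


def track_ids(csv_path, column="mix_path"):
    """Return the set of values found in `column` (assumed column name)."""
    with open(csv_path, newline="") as handle:
        return {row[column] for row in csv.DictReader(handle)}


def shared_tracks(train_csv, evaluation_csv, column="mix_path"):
    """Tracks listed in both files; an empty set means the splits are disjoint."""
    return track_ids(train_csv, column) & track_ids(evaluation_csv, column)
```

Calling shared_tracks on the two CSV paths from the training configuration should return an empty set if the splits are disjoint.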

Second: I am struggling with computation power. For my first test, it took almost 4 hours to run 100,000 steps on a p2.xlarge AWS instance (Tesla K80 GPU). Is this expected? Do you think that, given my problem, I could lower train_max_steps?

Third: This one is more of a note than a question, but I ran the evaluation using python -m spleeter evaluate -p spleeter:2stems --mus_dir [exported db dir] -o [local path] and got NaNs. Is the evaluation system broken at the moment? I changed it in a fork here to fix it for my case. Should I open a pull request for that?

Thank you for your help and for publishing your work!

Issue Analytics

Created 4 years ago
Comments: 43 (1 by maintainers)

Top GitHub Comments

2 reactions

gvne commented, Apr 4, 2020

Hi there, just wanted to follow up on that matter. I followed @romi1502’s suggestion and successfully implemented a very simple volume-control VST3 plug-in running in real time with spleeter! You can find the plug-in code right here. I also provide a pre-built binary for OSX (tested on 10.14 and 10.15) here.

As expected, latency is the worst part: 64 frames for spleeter, plus a few extra frames to leave enough time for the process to run properly (set to 10 in the pre-built binary, if I’m not mistaken). That leads to a latency close to 2 seconds… But still, playing with those sliders is so much fun! 😃
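As a back-of-the-envelope check of that figure, the sketch below adds the spleeter frames and the safety margin and converts the total to seconds. It assumes a hop of 1024 samples per frame at 44100 Hz, which this comment does not state, and it ignores any additional buffering by the host; the function name is hypothetical.

```python
def online_latency_seconds(model_frames, margin_frames,
                           hop_size=1024, sample_rate=44100):
    """Latency of buffering model_frames + margin_frames hops (assumed hop and rate)."""
    return (model_frames + margin_frames) * hop_size / sample_rate


# 64 frames for the model plus a 10-frame margin, as described above.
print(f"{online_latency_seconds(64, 10):.2f} s")
```

Under these assumptions the margin accounts for roughly an eighth of the total, so the model's frame count dominates the latency.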

As a side note, I haven’t yet released a new spleeterpp version that includes the on-line processing (the code is available in the develop branch, though). I still need to update the documentation with details about the algorithm; there are quite a few parameters, after all. I also need to verify that its output matches the classic process.

Anyway, thank you once again for releasing your work. If you ever have further suggestions to improve this integration, I’d be very happy to read them!

1 reaction

junh1024 commented, Apr 10, 2020

S1 should be OK. According to https://github.com/deezer/spleeter/blob/master/configs/5stems/base_config.json#L8, the FFT size is 4096, so the latency should be 4096 samples, or 4096/44100 ≈ 93 ms. Can you explain how you got it down to 8 ms?

I’m not experienced with PDC in JUCE/C++, but in JSFX, PDC is very manual: you set the PDC, the DAW gives you x samples in advance, and you delay your output, emitting 1 sample at a time.
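The plug-in side of that scheme can be sketched as a fixed-length FIFO: each incoming sample is queued and the sample that arrived `latency` samples earlier is emitted. This is a generic illustration in Python, not JSFX or JUCE code; the class name is made up, and the host-side compensation (the DAW feeding audio early) is not modeled.

```python
from collections import deque


class FixedDelay:
    """Delays a sample stream by `latency` samples; zeros come out first."""

    def __init__(self, latency):
        # Pre-fill with silence so the first `latency` outputs are zeros.
        self.buffer = deque([0.0] * latency)

    def process(self, sample):
        # Queue the new sample and emit the oldest one, one sample per call.
        self.buffer.append(sample)
        return self.buffer.popleft()


delay = FixedDelay(latency=3)
out = [delay.process(x) for x in [1.0, 2.0, 3.0, 4.0, 5.0]]
print(out)  # [0.0, 0.0, 0.0, 1.0, 2.0]
```

If the host is told the same latency value, it can send audio that many samples early so the delayed output lines up with the other tracks.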

I don’t think it’s a revolution, more like a workflow enhancement. BTW, iZotope had an AAX “RX 7 Music Rebalance” in 2018 and a VST “Ozone 9 Master Rebalance” in 2019.