Improving config for MUSDB18
Description
Training Spleeter on MUSDB18 using the provided configuration file produces very poor results. This is probably because the config was designed for evaluation only.
When using the provided config, I get the following results with museval:
vocals ==> SDR: 1.058 SIR: -5.229 ISR: 2.040 SAR: 12.087
drums ==> SDR: 1.205 SIR: -3.945 ISR: 1.987 SAR: 12.087
bass ==> SDR: 0.680 SIR: -6.822 ISR: 1.964 SAR: 12.087
other ==> SDR: 1.063 SIR: -5.320 ISR: 1.984 SAR: 12.087
Steps to reproduce
python -m spleeter train -p configs/musdb_config.json -d MUSDB18-WAV
Questions
To improve the config, I think the following things would need to be addressed (a sketch of possible adjusted values follows below):
- n_chunks_per_song is set to 1. Shouldn't this be larger for full-length tracks like those in MUSDB18?
- random_time_crop uses a fixed seed that is not updated during training. That means the chunks are deterministic, so only a very small fraction of the MUSDB18 dataset is actually used in training.
- train_max_steps is set to 100000. Was this tested on MUSDB18? Should this be increased?
- No early stopping is configured. On such a small dataset the model will suffer from significant overfitting.
Desired behavior is to update the config/docs to be able to train using MUSDB18.
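For illustration, here is a minimal sketch of how the fields named above could look in configs/musdb_config.json. The concrete values are assumptions for discussion rather than tested settings, and all other fields of the shipped config would stay unchanged:
{
  "train_max_steps": 1000000,
  "n_chunks_per_song": 20,
  "random_time_crop": true
}
Whether the fixed seed behind random_time_crop can be influenced from the config alone, and how early stopping could be added, is part of what this issue asks to clarify.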
Hey @syxu828, it is probably because the training didn't go through at all; at least that's how it was in my case. Can you try running the spleeter train command with --verbose added as an argument and look at the log messages at the end? I get something like this:
WARNING:tensorflow:Training with estimator made no steps. Perhaps input is empty or misspecified.
I suspect that the audio isn’t getting loaded, but I haven’t yet been able to fix it.
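For reference, this is the reproduction command from above with the verbose flag appended, as suggested; the dataset path is the same assumption as in the original report:
python -m spleeter train -p configs/musdb_config.json -d MUSDB18-WAV --verbose
If the log ends with the "Training with estimator made no steps" warning, no training steps actually ran, which matches the suspicion above that the audio is not being loaded.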
I think the issue is fixed now in the currently available version. I have since been able to train successfully on musdb.