[Bug] Multispeaker VITS training does not work in direct python but does with config.json
Describe the bug
Hi,
I want to train a VITS model with multiple speakers and external embeddings (aka d_vectors), so I provided VitsArgs in a recipe:
vits_args = VitsArgs(
    use_language_embedding=False,
    embedded_language_dim=1,
    use_speaker_embedding=False,
    num_languages=1,
    use_sdp=False,
    # Those 3 properties also have to be repeated in the config section (see https://github.com/coqui-ai/TTS/issues/1454#issuecomment-1081843205)
    use_d_vector_file=True,
    d_vector_file="/home/Caraduf/Models/d_vector_file_4_Voices.json",
    d_vector_dim=512,
)
and repeated the same 3 lines in the VitsConfig section, as explained here:
config = VitsConfig(
    model_args=vits_args,
    use_d_vector_file=True,
    d_vector_file="/home/Caraduf/Models/d_vector_file_4_Voices.json",
    d_vector_dim=512,
)
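Since the same three d-vector settings have to appear in both places, one way to keep them from drifting apart is to define them once and unpack them into both constructors. The sketch below only illustrates that pattern and is not a fix for the failure reported here. FakeVitsArgs, FakeVitsConfig and d_vector_mismatches are hypothetical stand-ins so the example runs without Coqui TTS installed; with the real library you would pass the same dictionary to VitsArgs and VitsConfig, assuming they accept these keyword arguments as in the recipe above.

```python
from dataclasses import dataclass, field
from typing import Optional


# Hypothetical stand-ins for VitsArgs / VitsConfig (not the real TTS classes).
@dataclass
class FakeVitsArgs:
    use_sdp: bool = True
    use_d_vector_file: bool = False
    d_vector_file: Optional[str] = None
    d_vector_dim: int = 0


@dataclass
class FakeVitsConfig:
    model_args: FakeVitsArgs = field(default_factory=FakeVitsArgs)
    use_d_vector_file: bool = False
    d_vector_file: Optional[str] = None
    d_vector_dim: int = 0


# Define the three d-vector settings once...
d_vector_settings = dict(
    use_d_vector_file=True,
    d_vector_file="/home/Caraduf/Models/d_vector_file_4_Voices.json",
    d_vector_dim=512,
)

# ...and unpack them into both the model args and the config.
vits_args = FakeVitsArgs(use_sdp=False, **d_vector_settings)
config = FakeVitsConfig(model_args=vits_args, **d_vector_settings)


def d_vector_mismatches(cfg, keys=tuple(d_vector_settings)):
    """Return the names of settings that differ between cfg and cfg.model_args."""
    return [k for k in keys if getattr(cfg, k) != getattr(cfg.model_args, k)]


# An empty list means the config and the model args agree.
print(d_vector_mismatches(config))
```

Calling d_vector_mismatches(config) just before starting the trainer gives a cheap sanity check that the two copies of the settings still agree.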
I then ran the training with python3 my_multispeaker_vits_training.py, but it failed with:
line 303, in _conv_forward
return F.conv1d(input, weight, bias, self.stride,
TypeError: conv1d() received an invalid combination of arguments - got (NoneType, Parameter, Parameter, tuple, tuple, tuple, int), but expected one of:
* (Tensor input, Tensor weight, Tensor bias, tuple of ints stride, tuple of ints padding, tuple of ints dilation, int groups)
didn't match because some of the arguments have invalid types: (NoneType, Parameter, Parameter, tuple, tuple, tuple, int)
* (Tensor input, Tensor weight, Tensor bias, tuple of ints stride, str padding, tuple of ints dilation, int groups)
didn't match because some of the arguments have invalid types: (NoneType, Parameter, Parameter, tuple, tuple, tuple, int)
This is also described by @harmlessman in this comment.
But if I run train_tts.py --config_path path/to/the/just/previously/created/config.json, the multispeaker training with external embeddings runs fine.
To Reproduce
See above.
Expected behavior
Launching a multi-speaker training the “direct Python way” with external speaker embeddings should work directly, without needing to pass the generated config.json file to the generic python3 train_tts.py --config_path X/Y/Z/config.json.
Logs
No response
Environment
CoquiTTS 0.8.0
Additional context
No response
Top GitHub Comments
Hi! Did you find a solution for this issue? I’m facing the same one, unfortunately.
That was what I was missing! Thanks for sharing your solution, @lokmantsui; this is much better than my workaround!