Fine-tuning HiFi-GAN with Glow-TTS npy


Hello! I’m trying to fine-tune HiFi-GAN with npy files (mel-spectrograms) generated by Glow-TTS. I generate the npy files with this code:

import numpy as np
import torch

# Assumes the Glow-TTS inference setup is already loaded:
# `hps`, `model`, `cmu_dict`, `symbols`, `commons`, and `text_to_sequence`.
def TTS(tst_stn, path):
    # Convert the input text to a sequence of symbol ids
    if getattr(hps.data, "add_blank", False):
        text_norm = text_to_sequence(tst_stn.strip(), ['english_cleaners'], cmu_dict)
        text_norm = commons.intersperse(text_norm, len(symbols))
    else:
        tst_stn = " " + tst_stn.strip() + " "
        text_norm = text_to_sequence(tst_stn.strip(), ['english_cleaners'], cmu_dict)
    sequence = np.array(text_norm)[None, :]
    x_tst = torch.from_numpy(sequence).cuda().long()
    x_tst_lengths = torch.tensor([x_tst.shape[1]]).cuda()

    # Inference mode: generate a mel-spectrogram from the text only
    with torch.no_grad():
        noise_scale = 0.667
        length_scale = 1.0
        (y_gen_tst, *_), *_, (attn_gen, *_) = model(x_tst, x_tst_lengths, gen=True, noise_scale=noise_scale, length_scale=length_scale)

    # Save the generated mel-spectrogram under the name taken from `path`
    np.save("hf/ft_dataset/" + path.split('/')[1] + '.npy', y_gen_tst.cpu().detach().numpy())

Next, I create a metadata file with lines of the form: wavs/x.wav | ft_dataset/x.npy

And I get the following error: RuntimeError: stack expects each tensor to be equal size, but got [8192] at entry 0 and [6623] at entry 6

HiFi-GAN does generate wavs from these npy files in inference mode with Glow-TTS.
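The two sizes in the error (8192 and 6623) suggest that one cropped audio segment came out shorter than the others in the batch. One possible cause, sketched below, is a mel-spectrogram with more frames than the wav supports: if the data loader picks a crop offset from the mel and applies it to the audio, the slice can run past the end of the wav. This is only an illustration under assumptions, not the HiFi-GAN loader itself: `HOP_SIZE = 256` is assumed, `SEGMENT_SIZE = 8192` is taken from the error message, and the helper names and the 20000-sample wav are hypothetical.

```python
import numpy as np

SEGMENT_SIZE = 8192  # taken from the size reported in the error message
HOP_SIZE = 256       # assumption: hop size of the vocoder config


def expected_frames(num_samples: int, hop_size: int = HOP_SIZE) -> int:
    """Number of whole mel frames a wav of num_samples supports (no padding)."""
    return num_samples // hop_size


def cropped_audio_length(num_samples: int, mel_start: int,
                         segment_size: int = SEGMENT_SIZE,
                         hop_size: int = HOP_SIZE) -> int:
    """Length of the audio slice taken at a frame offset chosen from the mel."""
    frames_per_seg = -(-segment_size // hop_size)  # ceiling division
    audio = np.zeros(num_samples, dtype=np.float32)
    start = mel_start * hop_size
    stop = (mel_start + frames_per_seg) * hop_size
    # Slicing past the end of the array silently returns a shorter segment.
    return audio[start:stop].shape[0]


# Hypothetical pair: a 20000-sample wav and a generated mel with 90 frames.
num_samples, mel_frames = 20000, 90
extra_frames = mel_frames - expected_frames(num_samples)
print("mel frames beyond what the wav supports:", extra_frames)
print("segment at offset 0:", cropped_audio_length(num_samples, 0))
print("segment at offset 57:", cropped_audio_length(num_samples, 57))
```

Running a check like `expected_frames` over every wav/npy pair in the metadata file would show which pairs disagree in length before training starts.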

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 8 (2 by maintainers)

Top GitHub Comments

3 reactions
jik876 commented, Dec 16, 2020

@4nton-P

Hello. To get mel-spectrograms for fine-tuning, you need to make some changes to the code. If you set the ‘gen’ argument to True, the length of the generated mel-spectrogram may not match the length of the ground-truth audio. In the branch of the forward pass where ‘gen’ is False, there is a part that computes the mean and variance from the output of the encoder and the output of the decoder. If you use these to sample z from a Gaussian and feed it to the decoder with ‘reverse=True’, you will get the desired result. See lines 313 and 299 in models.py. Also, ‘noise_scale’ can affect the quality. You will get good results with the default settings, but it is worth experimenting with different ‘noise_scale’ values.
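The sampling step described in this comment can be sketched in NumPy. This is a toy illustration, not the Glow-TTS code: `sample_latent`, the 80-channel size, and the hand-built alignment are hypothetical stand-ins. In the real model the alignment would come from the forward pass with ‘gen’ set to False, and the sampled z would then go through the decoder with ‘reverse=True’, which is not shown here. The point is that z gets one column per ground-truth frame, so the resulting mel matches the wav length.

```python
import numpy as np

rng = np.random.default_rng(0)


def sample_latent(x_m, x_logs, attn, noise_scale=0.667):
    """Expand per-token mean / log-std to frame rate and sample z.

    x_m, x_logs: [channels, n_tokens] stand-ins for the encoder outputs.
    attn:        [n_tokens, n_frames] hard alignment, one token per frame.
    Returns z with shape [channels, n_frames].
    """
    z_m = x_m @ attn        # frame-level mean
    z_logs = x_logs @ attn  # frame-level log standard deviation
    noise = rng.standard_normal(z_m.shape)
    return z_m + np.exp(z_logs) * noise * noise_scale


# Toy alignment: 3 tokens spread over 7 ground-truth frames (durations 2, 4, 1).
durations = [2, 4, 1]
n_frames = sum(durations)
attn = np.zeros((len(durations), n_frames))
frame = 0
for token, duration in enumerate(durations):
    attn[token, frame:frame + duration] = 1.0
    frame += duration

x_m = rng.standard_normal((80, len(durations)))  # 80 channels is arbitrary here
x_logs = np.zeros((80, len(durations)))
z = sample_latent(x_m, x_logs, attn)
print(z.shape)  # frame axis equals the number of ground-truth frames
```

Because the frame count comes from the alignment to the ground-truth mel rather than from predicted durations, the length mismatch described in the comment should not arise.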

1 reaction
Rashi2011 commented, Jun 6, 2021

I am taking mels from FastSpeech2 and feeding them to HiFi-GAN to generate audio, but I am getting noise in the audio file. I made the shapes compatible, but there still seem to be problems internally. Please share any ideas I can try.
