Why does the mel.npy file I trained with the tacotron2 model not match the dimensions in hifi-gan?

I trained the tacotron2 model to produce mel_**.npy files, but running inference with this model reports a dimension mismatch error:

        python3 inference_e2e.py --checkpoint_file cp_hifigan-1208-test/g_00036000

        Initializing Inference Process..
        cp_hifigan-1208-test/g_00036000
        Loading 'cp_hifigan-1208-test/g_00036000'
        Complete.
        Removing weight norm...
        Traceback (most recent call last):
          File "inference_e2e.py", line 90, in <module>
            main()
          File "inference_e2e.py", line 86, in main
            inference(a)
          File "inference_e2e.py", line 51, in inference
            y_g_hat = generator(x)
          File "/home/zhchen/python3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 547, in __call__
            result = self.forward(*input, **kwargs)
          File "/home/zhchen/hifi-gan-master/models.py", line 101, in forward
            x = self.conv_pre(x)
          File "/home/zhchen/python3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 547, in __call__
            result = self.forward(*input, **kwargs)
          File "/home/zhchen/python3/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 200, in forward
            self.padding, self.dilation, self.groups)
        RuntimeError: Expected 3-dimensional input for 3-dimensional weight 128 80, but got 2-dimensional input of size [288, 80] instead

Issue Analytics

State:
Created 3 years ago
Comments: 5 (2 by maintainers)

Top GitHub Comments

3 reactions

Miralan commented, Dec 10, 2020

It is because of the shape of the input data. Insert the following code at line 50 of the inference_e2e.py file and try again.
        if len(x.shape) < 3:
            x = x.unsqueeze(0)
We will update it soon.

Maybe the version below is better:

          if len(x.shape) < 3:
              x = x.unsqueeze(0)
          if not x.shape[1] == h.num_mels:
              x = x.tranpose(1, 2)

0 reactions

McTi commented, May 18, 2021

Just a heads up that there’s a typo in that code snippet… tranpose -> transpose

Should be:

          if len(x.shape) < 3:
              x = x.unsqueeze(0)
          if not x.shape[1] == h.num_mels:
              x = x.transpose(1, 2)
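
For anyone who would rather fix the shape before the array is turned into a tensor, here is a minimal sketch of the same check done in NumPy. It is not code from the hifi-gan repository: the helper name `to_generator_input` and the constant `NUM_MELS` are made up for illustration, and the value 80 is only assumed from the error message above (use whatever `num_mels` is in your config). The idea is the same as the snippet above: add a batch axis if it is missing, then swap the last two axes if the mel channels are not on axis 1.

```python
import numpy as np

NUM_MELS = 80  # assumption: should match num_mels in your HiFi-GAN config


def to_generator_input(mel, num_mels=NUM_MELS):
    """Return `mel` as a float32 array shaped (batch, num_mels, frames).

    Hypothetical helper, not part of hifi-gan. Accepts a 2-D array
    (frames, num_mels) or (num_mels, frames), or an already batched 3-D array.
    """
    mel = np.asarray(mel, dtype=np.float32)

    # A saved spectrogram is often 2-D; add the missing batch axis.
    if mel.ndim == 2:
        mel = mel[np.newaxis, ...]
    if mel.ndim != 3:
        raise ValueError("expected a 2-D or 3-D array, got %d-D" % mel.ndim)

    # The generator's first Conv1d wants the mel channels on axis 1.
    if mel.shape[1] != num_mels:
        if mel.shape[2] != num_mels:
            raise ValueError(
                "no axis of size %d in shape %s" % (num_mels, mel.shape)
            )
        mel = mel.transpose(0, 2, 1)
    return mel


if __name__ == "__main__":
    # Same layout as in the error message: 288 frames x 80 mel bins.
    frames_first = np.zeros((288, 80), dtype=np.float32)
    print(to_generator_input(frames_first).shape)  # (1, 80, 288)
```

One caveat: if a clip happens to have exactly `num_mels` frames, the two layouts cannot be told apart by shape alone, and this sketch (like the snippet above) leaves the array as it is.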
