Why does the mel .npy file produced by my trained tacotron2 model not match the input dimensions hifi-gan expects?

I trained a tacotron2 model to produce mel_**.npy files, but when I run inference on them with this model, a dimension mismatch error is reported:

    python3 inference_e2e.py --checkpoint_file cp_hifigan-1208-test/g_00036000
    Initializing Inference Process..
    cp_hifigan-1208-test/g_00036000
    Loading 'cp_hifigan-1208-test/g_00036000'
    Complete.
    Removing weight norm...
    Traceback (most recent call last):
      File "inference_e2e.py", line 90, in <module>
        main()
      File "inference_e2e.py", line 86, in main
        inference(a)
      File "inference_e2e.py", line 51, in inference
        y_g_hat = generator(x)
      File "/home/zhchen/python3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 547, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/zhchen/hifi-gan-master/models.py", line 101, in forward
        x = self.conv_pre(x)
      File "/home/zhchen/python3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 547, in __call__
        result = self.forward(*input, **kwargs)
      File "/home/zhchen/python3/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 200, in forward
        self.padding, self.dilation, self.groups)
    RuntimeError: Expected 3-dimensional input for 3-dimensional weight 128 80, but got 2-dimensional input of size [288, 80] instead

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (2 by maintainers)

Top GitHub Comments

3 reactions
Miralan commented, Dec 10, 2020

It is because of the shape of the input data: the generator expects a 3-dimensional input, but the loaded mel is 2-dimensional (size [288, 80]). Insert the following code at line 50 of inference_e2e.py and try again.

        if len(x.shape) < 3:
            x = x.unsqueeze(0)

We will update it soon.

Maybe the version below is better:

          if len(x.shape) < 3:
              x = x.unsqueeze(0)
          if not x.shape[1] == h.num_mels:
              x = x.tranpose(1, 2) 
0 reactions
McTi commented, May 18, 2021

Just a heads up that there’s a typo in that code snippet… tranpose -> transpose

Should be:

          if len(x.shape) < 3:
              x = x.unsqueeze(0)
          if not x.shape[1] == h.num_mels:
              x = x.transpose(1, 2) 
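The same shape fix can also be sketched on the NumPy side, before the array is turned into a tensor. This is a minimal illustration and not code from the hifi-gan repository: the helper name `to_generator_input` and the constant `NUM_MELS` are hypothetical, and it assumes the generator wants input shaped (batch, num_mels, frames) with `num_mels` equal to 80, as the error message and the `h.num_mels` check above suggest.

```python
import numpy as np

NUM_MELS = 80  # assumption: should equal h.num_mels from the HiFi-GAN config


def to_generator_input(mel, num_mels=NUM_MELS):
    """Return `mel` shaped (batch, num_mels, frames).

    Hypothetical helper mirroring the unsqueeze/transpose fix from the thread.
    """
    mel = np.asarray(mel, dtype=np.float32)
    if mel.ndim == 2:
        # Add the missing batch axis (the NumPy analogue of x.unsqueeze(0)).
        mel = mel[np.newaxis, ...]
    if mel.ndim != 3:
        raise ValueError("expected a 2-D or 3-D mel array, got %d-D" % mel.ndim)
    if mel.shape[1] != num_mels:
        # Mel saved as (frames, num_mels): swap the last two axes,
        # the analogue of x.transpose(1, 2).
        mel = mel.transpose(0, 2, 1)
    if mel.shape[1] != num_mels:
        raise ValueError("no axis of size %d found in shape %s"
                         % (num_mels, mel.shape))
    return mel


# Same shape as the failing input in the traceback: 288 frames x 80 mel bins.
mel = np.zeros((288, 80), dtype=np.float32)
print(to_generator_input(mel).shape)  # (1, 80, 288)
```

Note that if a file happens to contain exactly `num_mels` frames, the shape alone cannot tell the two layouts apart, so it is worth checking how the tacotron2 side writes its .npy files.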
