Sorry, a newbie here asking for basic setup info
I tried to set up a proper environment to generate the demos, but I'm getting some errors.
```
(Hifigan384test) D:\Coding\PYFastCache\PYVenv\Hifigan384test\hifi-gan-master>python inference.py --checkpoint_file vctk_v2\generator_v2 --input_wavs_dir test_files
Initializing Inference Process…
Loading 'vctk_v2\generator_v2'
Complete.
Removing weight norm…
D:\Coding\PYFastCache\PYVenv\Hifigan384test\hifi-gan-master\meldataset.py:15: WavFileWarning: Chunk (non-data) not understood, skipping it.
  sampling_rate, data = read(full_path)
Traceback (most recent call last):
  File "inference.py", line 94, in <module>
    main()
  File "inference.py", line 90, in main
    inference(a)
  File "inference.py", line 54, in inference
    x = get_mel(wav.unsqueeze(0))
  File "inference.py", line 26, in get_mel
    return mel_spectrogram(x, h.n_fft, h.num_mels, h.sampling_rate, h.hop_size, h.win_size, h.fmin, h.fmax)
  File "D:\Coding\PYFastCache\PYVenv\Hifigan384test\hifi-gan-master\meldataset.py", line 61, in mel_spectrogram
    y = torch.nn.functional.pad(y.unsqueeze(1), (int((n_fft-hop_size)/2), int((n_fft-hop_size)/2)), mode='reflect')
  File "D:\Coding\PYFastCache\PYVenv\Hifigan384test\lib\site-packages\torch\nn\functional.py", line 2877, in pad
    assert len(pad) == 4, '4D tensors expect 4 values for padding'
AssertionError: 4D tensors expect 4 values for padding
```
Not quite sure what went wrong. Maybe the audio?
Here's the layout:
```
02/12/2020  18:04    <DIR>          .
02/12/2020  18:04    <DIR>          ..
30/11/2020  23:03               762 config_v1.json
30/11/2020  23:03               762 config_v2.json
30/11/2020  23:03               752 config_v3.json
30/11/2020  23:03               394 env.py
02/12/2020  18:00    <DIR>          generated_files
02/12/2020  16:59    <DIR>          generated_files_from_mel
30/11/2020  23:03             2,652 inference.py
30/11/2020  23:03             2,444 inference_e2e.py
30/11/2020  23:03             1,067 LICENSE
30/11/2020  23:03    <DIR>          LJSpeech-1.1
30/11/2020  23:03             6,314 meldataset.py
30/11/2020  23:03             9,905 models.py
30/11/2020  23:03             4,767 README.md
30/11/2020  23:03               113 requirements.txt
02/12/2020  18:04    <DIR>          test_files
30/11/2020  23:03            12,153 train.py
30/11/2020  23:03             1,377 utils.py
30/11/2020  23:03            10,995 validation_loss.png
02/12/2020  16:53    <DIR>          vctk_v2
02/12/2020  16:59    <DIR>          __pycache__
              14 File(s)         54,457 bytes
               8 Dir(s)   7,409,783,296 bytes free
```
By the way, for the audio I just recorded my own voice. I'm not quite sure what is needed for input.
I assume it just needs some WAV audio?

Hmmm… maybe I will do a PR to include a note about the audio specs. Maybe that can keep someone as stupid as me from forgetting to use mono audio and the right sample rate, hahaha. I had been using my setup to record instruments and forgot to change those settings. Thanks for the guidance, everyone. I did manage to generate the audio. I will research more about how to give text input or a mel spectrogram.
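For anyone who wants to check a recording before running inference, here is a minimal sketch using only Python's standard `wave` module. The function name `wav_problems` and the `expected_rate` argument are made up for this example; the rate to compare against is presumably the `sampling_rate` value in the config JSON that goes with your checkpoint. Note that `wave` only reads plain PCM files, so a float-encoded WAV would raise `wave.Error` here.

```python
import wave


def wav_problems(path, expected_rate):
    """Return a list of human-readable problems found in a PCM WAV file.

    `expected_rate` is an assumption supplied by the caller, e.g. the
    `sampling_rate` field of the config used with the checkpoint.
    """
    with wave.open(path, "rb") as f:
        channels = f.getnchannels()
        rate = f.getframerate()

    problems = []
    if channels != 1:
        problems.append(f"{channels} channels, expected mono")
    if rate != expected_rate:
        problems.append(f"{rate} Hz, expected {expected_rate} Hz")
    return problems
```

Calling something like `wav_problems("test_files/my_voice.wav", 22050)` (file name and rate are placeholders) returns an empty list when the file looks fine, and otherwise lists what to fix in your audio editor before trying again.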
Ahhh, stereo… I see. OK, OK, sorry, sorry. Should we also add that to the README then?
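In case it helps the next person, here is a rough sketch of why a stereo file ends up as a "4D tensor". It only tracks shapes with plain tuples (no torch needed) and follows the two `unsqueeze` calls visible in the traceback. It assumes the WAV reader returns a 1-D array for mono and a `(samples, channels)` array for stereo, which is how `scipy.io.wavfile.read` behaves; the helper names and the sample count are made up for illustration.

```python
def unsqueeze(shape, dim):
    """Shape after inserting a length-1 axis at `dim`, mirroring torch.Tensor.unsqueeze."""
    return shape[:dim] + (1,) + shape[dim:]


def shape_before_pad(wav_shape):
    """Follow the two unsqueeze calls seen in the traceback."""
    batched = unsqueeze(wav_shape, 0)  # inference.py: wav.unsqueeze(0)
    return unsqueeze(batched, 1)       # meldataset.py: y.unsqueeze(1)


n = 48000  # hypothetical number of samples

mono = shape_before_pad((n,))      # 3-D, so the 2-value pad is accepted
stereo = shape_before_pad((n, 2))  # 4-D, so pad asks for 4 values and the assert fires

print(len(mono), mono)
print(len(stereo), stereo)
```

So the extra channel axis from a stereo recording is what pushes the tensor from 3-D to 4-D, and exporting the recording as mono removes that axis.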