Unsupervised model training
While trying to train custom models on top of the unsupervised pre-trained upstream models (HuBERT and wav2vec), training throws errors.
I am using an older version of ESPnet (espnet 0.10.4a1). Is this a known issue? Should I move to the latest codebase for this to work properly? Could this be caused by an incompatibility between the ESPnet frontend interface code and s3prl?
Sample errors:
HuBERT model:
File "/miniconda/envs/espnet/lib/python3.8/site-packages/omegaconf/_utils.py", line 610, in _raise raise ex # set end OC_CAUSE=1 for full backtrace File "/miniconda/envs/espnet/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 303, in __getattr__ return self._get_impl(key=key, default_value=DEFAULT_VALUE_MARKER) File "/miniconda/envs/espnet/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 361, in _get_impl node = self._get_node(key=key) File "/miniconda/envs/espnet/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 383, in _get_node self._validate_get(key) File "/miniconda/envs/espnet/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 135, in _validate_get self._format_and_raise( File "/miniconda/envs/espnet/lib/python3.8/site-packages/omegaconf/base.py", line 95, in _format_and_raise format_and_raise( File "/miniconda/envs/espnet/lib/python3.8/site-packages/omegaconf/_utils.py", line 694, in format_and_raise _raise(ex, cause) File "/miniconda/envs/espnet/lib/python3.8/site-packages/omegaconf/_utils.py", line 610, in _raise raise ex # set end OC_CAUSE=1 for full backtrace **omegaconf.errors.ConfigAttributeError: Key 'required_seq_len_multiple' not in 'HubertConfig' full_key: required_seq_len_multiple reference_type=Optional[HubertConfig] object_type=HubertConfig** Using cache found in ./hub/s3prl_cache/4a54d64fa42b41e39db994c958d8107d5785a100f38c6eba680b6a3cc79babb3 for https://dl.fbaipublicfiles.com/hubert/hubert_large_ll60k.pt
For wav2vec:
```
[50df66fda69a] 2022-01-28 10:35:45,723 (asr:386) INFO: Vocabulary size: 5000
[Featurizer] - Take a list of 25 features and weighted sum them.
Traceback (most recent call last):
  File "/miniconda/envs/espnet/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/miniconda/envs/espnet/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/opt/espnet2/espnet2/bin/asr_train.py", line 23, in <module>
    main()
  File "/opt/espnet2/espnet2/bin/asr_train.py", line 19, in main
    ASRTask.main(cmd=cmd)
  File "/opt/espnet2/espnet2/tasks/abs_task.py", line 1007, in main
    cls.main_worker(args)
  File "/opt/espnet2/espnet2/tasks/abs_task.py", line 1109, in main_worker
    model = cls.build_model(args=args)
  File "/opt/espnet2/espnet2/tasks/asr.py", line 392, in build_model
    frontend = frontend_class(**args.frontend_conf)
  File "/opt/espnet2/espnet2/asr/frontend/s3prl.py", line 49, in __init__
    self.upstream, self.featurizer = self._get_upstream(frontend_conf)
  File "/opt/espnet2/espnet2/asr/frontend/s3prl.py", line 90, in _get_upstream
    s3prl_featurizer = Featurizer(
  File "/opt/espnet2/tools/s3prl/s3prl/upstream/interfaces.py", line 178, in __init__
    feature = self._weighted_sum([f.cpu() for f in feature])
  File "/opt/espnet2/tools/s3prl/s3prl/upstream/interfaces.py", line 231, in _weighted_sum
    stacked_feature = torch.stack(feature, dim=0)
RuntimeError: stack expects each tensor to be equal size, but got [1, 50, 1024] at entry 0 and [1, 49, 1024] at entry 24
Using cache found in ./hub/s3prl_cache/0edc65775079e001501f97abaa69ef5ec67efb9d163d425f59a9bf3c71c3802a for https://dl.fbaipublicfiles.com/fairseq/wav2vec/libri960_big.pt
```
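Here the upstream returns 25 layers of hidden states for the weighted sum, and one layer is a frame shorter than the others (49 vs. 50 frames), so `torch.stack` inside the Featurizer fails. The sketch below reproduces the shape mismatch and shows one way the lengths could be reconciled before stacking; it is an illustration, not the actual s3prl `_weighted_sum` code, and a newer s3prl may already handle such off-by-one differences.

```python
# Minimal sketch of the failure and one possible workaround; this is an
# illustration, not the actual s3prl Featurizer code.
import torch
import torch.nn.functional as F

# 25 layer outputs, one of which is a frame shorter (50 vs 49 frames),
# mimicking the shapes in the error message.
features = [torch.randn(1, 50, 1024) for _ in range(24)] + [torch.randn(1, 49, 1024)]

try:
    torch.stack(features, dim=0)             # what _weighted_sum does
except RuntimeError as err:
    print("reproduced:", err)

# Workaround idea: trim (or pad) every layer to a common number of frames
# before stacking, then do the weighted sum.
min_len = min(f.size(1) for f in features)
trimmed = [f[:, :min_len] for f in features]
stacked = torch.stack(trimmed, dim=0)        # (num_layers, batch, frames, dim)

weights = F.softmax(torch.zeros(len(features)), dim=0)   # stand-in for learned weights
weighted_sum = (weights.view(-1, 1, 1, 1) * stacked).sum(dim=0)
print(weighted_sum.shape)                    # torch.Size([1, 49, 1024])
```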
PFB lib versions:

```
Nvidia driver version: 465.19.01
cuDNN version: Probably one of the following:
  /usr/lib/x86_64-linux-gnu/libcudnn.so.8.0.5
  /usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.0.5
  /usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.0.5
  /usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.0.5
  /usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.0.5
  /usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.0.5
  /usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.0.5
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.21.2
[pip3] pytorch-lightning==1.4.9
[pip3] pytorch-ranger==0.1.1
[pip3] pytorch-wpe==0.0.1
[pip3] torch==1.10.0
[pip3] torch-complex==0.2.1
[pip3] torch-optimizer==0.1.0
[pip3] torch-stoi==0.1.2
[pip3] torchaudio==0.10.0
[pip3] torchmetrics==0.7.0
[conda] blas 1.0 mkl
[conda] cudatoolkit 11.1.74 h6bb024c_0 nvidia
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py38h7f8727e_0
[conda] mkl_fft 1.3.1 py38hd3c417c_0
[conda] mkl_random 1.2.2 py38h51133e4_0
[conda] numpy 1.21.2 py38h20f2e39_0
[conda] numpy-base 1.21.2 py38h79a1101_0
[conda] pytorch 1.10.0 py3.8_cuda11.1_cudnn8.0.5_0 pytorch
[conda] pytorch-lightning 1.4.9 pypi_0 pypi
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] pytorch-ranger 0.1.1 pypi_0 pypi
[conda] pytorch-wpe 0.0.1 pypi_0 pypi
[conda] torch-complex 0.2.1 pypi_0 pypi
[conda] torch-optimizer 0.1.0 pypi_0 pypi
[conda] torch-stoi 0.1.2 pypi_0 pypi
[conda] torchaudio 0.10.0 py38_cu111 pytorch
[conda] torchmetrics 0.7.0 pypi_0 pypi
```
Top GitHub Comments
Oh, that is because this config was not designed for global_mvn. You should specify `--feats_normalize utt_mvn` in run.sh.

Yes, I have unused_parameters in my config. I am using the latest config only.
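For context: `global_mvn` normalizes features with dataset-level statistics collected in a separate stage, whereas utterance-level MVN normalizes each utterance with its own mean and variance, so no global statistics matching the SSL features are needed. The sketch below only illustrates that idea; it is not ESPnet's `UtteranceMVN` module, and the function name `utterance_mvn` is made up for the example.

```python
# Illustration of utterance-level mean-variance normalization (the idea behind
# utt_mvn); not ESPnet's actual UtteranceMVN implementation.
import torch


def utterance_mvn(feats: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Normalize each utterance by its own statistics.

    feats: (batch, frames, dim) features, e.g. from the s3prl frontend.
    """
    mean = feats.mean(dim=1, keepdim=True)   # per-utterance mean over frames
    std = feats.std(dim=1, keepdim=True)     # per-utterance std over frames
    return (feats - mean) / (std + eps)


x = torch.randn(2, 120, 1024)                # e.g. wav2vec/HuBERT features
y = utterance_mvn(x)
print(y.mean(dim=1).abs().max(), y.std(dim=1).mean())  # ~0 mean, ~1 std per utterance
```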