
Unsupervised model training

See original GitHub issue

While trying to train custom models on top of the unsupervised pre-trained upstream models (HuBERT and wav2vec), training throws errors.

I am using an old version of ESPnet (0.10.4a1). Could that itself be the problem? Should I move to the latest codebase for this to work properly? Could this be caused by an incompatibility between the ESPnet interface code and s3prl?

Sample errors.

HuBERT model:

```
  File "/miniconda/envs/espnet/lib/python3.8/site-packages/omegaconf/_utils.py", line 610, in _raise
    raise ex  # set end OC_CAUSE=1 for full backtrace
  File "/miniconda/envs/espnet/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 303, in __getattr__
    return self._get_impl(key=key, default_value=DEFAULT_VALUE_MARKER)
  File "/miniconda/envs/espnet/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 361, in _get_impl
    node = self._get_node(key=key)
  File "/miniconda/envs/espnet/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 383, in _get_node
    self._validate_get(key)
  File "/miniconda/envs/espnet/lib/python3.8/site-packages/omegaconf/dictconfig.py", line 135, in _validate_get
    self._format_and_raise(
  File "/miniconda/envs/espnet/lib/python3.8/site-packages/omegaconf/base.py", line 95, in _format_and_raise
    format_and_raise(
  File "/miniconda/envs/espnet/lib/python3.8/site-packages/omegaconf/_utils.py", line 694, in format_and_raise
    _raise(ex, cause)
  File "/miniconda/envs/espnet/lib/python3.8/site-packages/omegaconf/_utils.py", line 610, in _raise
    raise ex  # set end OC_CAUSE=1 for full backtrace
omegaconf.errors.ConfigAttributeError: Key 'required_seq_len_multiple' not in 'HubertConfig'
    full_key: required_seq_len_multiple
    reference_type=Optional[HubertConfig]
    object_type=HubertConfig
Using cache found in ./hub/s3prl_cache/4a54d64fa42b41e39db994c958d8107d5785a100f38c6eba680b6a3cc79babb3 for https://dl.fbaipublicfiles.com/hubert/hubert_large_ll60k.pt
```
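A `ConfigAttributeError` of this shape usually suggests a version mismatch: the code reads a key (`required_seq_len_multiple`) that the `HubertConfig` stored in the checkpoint never defined. The following is a minimal pure-Python sketch of that mismatch, using a hypothetical dataclass stand-in rather than fairseq's real config class:

```python
from dataclasses import dataclass, fields

# Hypothetical stand-in for an older HubertConfig: it defines fewer
# fields than the newer code expects, which omegaconf surfaces as a
# ConfigAttributeError when the missing key is accessed in struct mode.
@dataclass
class StoredHubertConfig:
    encoder_layers: int = 24  # 'required_seq_len_multiple' is absent here

requested_keys = {"encoder_layers", "required_seq_len_multiple"}
available_keys = {f.name for f in fields(StoredHubertConfig)}

# Any requested key not present in the stored config triggers the error.
missing = requested_keys - available_keys
print(missing)  # {'required_seq_len_multiple'}
```

If that diagnosis holds, aligning the fairseq/s3prl versions with the ones the checkpoint was trained with would be the place to start.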

For wav2vec:

```
[50df66fda69a] 2022-01-28 10:35:45,723 (asr:386) INFO: Vocabulary size: 5000
[Featurizer] - Take a list of 25 features and weighted sum them.
Traceback (most recent call last):
  File "/miniconda/envs/espnet/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/miniconda/envs/espnet/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/opt/espnet2/espnet2/bin/asr_train.py", line 23, in <module>
    main()
  File "/opt/espnet2/espnet2/bin/asr_train.py", line 19, in main
    ASRTask.main(cmd=cmd)
  File "/opt/espnet2/espnet2/tasks/abs_task.py", line 1007, in main
    cls.main_worker(args)
  File "/opt/espnet2/espnet2/tasks/abs_task.py", line 1109, in main_worker
    model = cls.build_model(args=args)
  File "/opt/espnet2/espnet2/tasks/asr.py", line 392, in build_model
    frontend = frontend_class(**args.frontend_conf)
  File "/opt/espnet2/espnet2/asr/frontend/s3prl.py", line 49, in __init__
    self.upstream, self.featurizer = self._get_upstream(frontend_conf)
  File "/opt/espnet2/espnet2/asr/frontend/s3prl.py", line 90, in _get_upstream
    s3prl_featurizer = Featurizer(
  File "/opt/espnet2/tools/s3prl/s3prl/upstream/interfaces.py", line 178, in __init__
    feature = self._weighted_sum([f.cpu() for f in feature])
  File "/opt/espnet2/tools/s3prl/s3prl/upstream/interfaces.py", line 231, in _weighted_sum
    stacked_feature = torch.stack(feature, dim=0)
RuntimeError: stack expects each tensor to be equal size, but got [1, 50, 1024] at entry 0 and [1, 49, 1024] at entry 24
Using cache found in ./hub/s3prl_cache/0edc65775079e001501f97abaa69ef5ec67efb9d163d425f59a9bf3c71c3802a for https://dl.fbaipublicfiles.com/fairseq/wav2vec/libri960_big.pt
```
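The `RuntimeError` says that the hidden states of layer 24 are one frame shorter than the other layers' (49 vs. 50 frames), so `torch.stack` cannot combine the 25 layers into one tensor for the weighted sum. One common remedy for this class of problem is to trim every layer to the shortest time length before stacking. Here is a pure-Python sketch of that idea, with nested lists standing in for the `[1, T, 1024]` tensors; it illustrates the trimming step only, not s3prl's actual code:

```python
# Hidden states per layer as nested lists [batch][time][dim]; the last
# layer is one frame shorter (49 vs 50), mirroring the RuntimeError above.
features = [[[0.0] * 1024 for _ in range(50)] for _ in range(24)]
features.append([[0.0] * 1024 for _ in range(49)])
features = [[layer] for layer in features]  # add the batch dim: [1, T, 1024]

# Stacking requires equal shapes, so cut every layer down to the common
# minimum time length before the weighted sum.
min_len = min(len(f[0]) for f in features)
trimmed = [[seq[:min_len] for seq in f] for f in features]

print(min_len)                                     # 49
print(all(len(f[0]) == min_len for f in trimmed))  # True
```

Whether the length mismatch itself is expected (different padding behavior between upstream layers) or a symptom of the same version mismatch as the HuBERT error is worth checking against the s3prl version in use.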

PFB lib versions:

```
Nvidia driver version: 465.19.01
cuDNN version: Probably one of the following:
  /usr/lib/x86_64-linux-gnu/libcudnn.so.8.0.5
  /usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.0.5
  /usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.0.5
  /usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.0.5
  /usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.0.5
  /usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.0.5
  /usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.0.5
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.21.2
[pip3] pytorch-lightning==1.4.9
[pip3] pytorch-ranger==0.1.1
[pip3] pytorch-wpe==0.0.1
[pip3] torch==1.10.0
[pip3] torch-complex==0.2.1
[pip3] torch-optimizer==0.1.0
[pip3] torch-stoi==0.1.2
[pip3] torchaudio==0.10.0
[pip3] torchmetrics==0.7.0
[conda] blas               1.0       mkl
[conda] cudatoolkit        11.1.74   h6bb024c_0                   nvidia
[conda] mkl                2021.4.0  h06a4308_640
[conda] mkl-service        2.4.0     py38h7f8727e_0
[conda] mkl_fft            1.3.1     py38hd3c417c_0
[conda] mkl_random         1.2.2     py38h51133e4_0
[conda] numpy              1.21.2    py38h20f2e39_0
[conda] numpy-base         1.21.2    py38h79a1101_0
[conda] pytorch            1.10.0    py3.8_cuda11.1_cudnn8.0.5_0  pytorch
[conda] pytorch-lightning  1.4.9     pypi_0                       pypi
[conda] pytorch-mutex      1.0       cuda                         pytorch
[conda] pytorch-ranger     0.1.1     pypi_0                       pypi
[conda] pytorch-wpe        0.0.1     pypi_0                       pypi
[conda] torch-complex      0.2.1     pypi_0                       pypi
[conda] torch-optimizer    0.1.0     pypi_0                       pypi
[conda] torch-stoi         0.1.2     pypi_0                       pypi
[conda] torchaudio         0.10.0    py38_cu111                   pytorch
[conda] torchmetrics       0.7.0     pypi_0                       pypi
```

Issue Analytics

  • State:open
  • Created 2 years ago
  • Comments:10

Top GitHub Comments

1 reaction
simpleoier commented, Jan 31, 2022

Oh, that is because this config was not designed for global_mvn. You should specify --feats_normalize utt_mvn in run.sh.

1 reaction
arunbaby0 commented, Jan 28, 2022

Yes, I have unused_parameters in my config, and I am already using the latest config.
