[Bug] ValueError: num_samples should be a positive integer value, but got num_samples=0 training YourTTS from checkpoint

🐛 Description

Hi! I am trying to train the YourTTS model, obtained from the archive linked at https://github.com/Edresson/YourTTS/issues/8, and to add the Italian language using the mailabs dataset. I prepared the datasets, extracted the d_vectors file, and adjusted the config accordingly. However, training crashes:

> EPOCH: 0/1000
 --> ../checkpoints/vits_tts-quadruplo-February-23-2022_04+50PM-c63bb481

 > DataLoader initialization
 | > Use phonemes: False
 | > Number of instances : 152781
 | > Max length sequence: 706582.0
 | > Min length sequence: 4822.0
 | > Avg length sequence: 83586.53155169818
 | > Num. instances discarded by max-min (max=270, min=90) seq limits: 152781
 | > Batch group size: 0.
 > Using Language weighted sampler
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/TTS/trainer.py", line 1007, in fit
    self._fit()
  File "/opt/conda/lib/python3.8/site-packages/TTS/trainer.py", line 992, in _fit
    self.train_epoch()
  File "/opt/conda/lib/python3.8/site-packages/TTS/trainer.py", line 801, in train_epoch
    self.train_loader = self.get_train_dataloader(
  File "/opt/conda/lib/python3.8/site-packages/TTS/trainer.py", line 498, in get_train_dataloader
    return self._get_loader(self.model, self.config, training_assets, False, data_items, verbose, self.num_gpus)
  File "/opt/conda/lib/python3.8/site-packages/TTS/trainer.py", line 484, in _get_loader
    loader = model.get_data_loader(config, assets, is_eval, data_items, verbose, num_gpus)
  File "/opt/conda/lib/python3.8/site-packages/TTS/tts/models/base_tts.py", line 360, in get_data_loader
    sampler = get_language_weighted_sampler(dataset.items)
  File "/opt/conda/lib/python3.8/site-packages/TTS/tts/utils/languages.py", line 122, in get_language_weighted_sampler
    return WeightedRandomSampler(dataset_samples_weight, len(dataset_samples_weight))
  File "/opt/conda/lib/python3.8/site-packages/torch/utils/data/sampler.py", line 175, in __init__
    raise ValueError("num_samples should be a positive integer "
ValueError: num_samples should be a positive integer value, but got num_samples=0

Apparently the PyTorch DataLoader was initialized correctly, since it reports 152781 instances.
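
Note that the same log also reports 152781 instances discarded by the max=270, min=90 sequence limits, which is every instance, and that the shortest logged length (4822.0) is already above max_seq_len. A minimal sketch of that filtering arithmetic, assuming an inclusive range check; `filter_by_length` is a hypothetical helper for illustration, not the TTS library's code:

```python
from typing import Iterable, List


def filter_by_length(lengths: Iterable[float], min_len: float, max_len: float) -> List[float]:
    """Keep only lengths inside [min_len, max_len] (hypothetical helper, not the TTS API)."""
    return [n for n in lengths if min_len <= n <= max_len]


# Values taken from the log above: shortest, average, and longest instance.
logged_lengths = [4822.0, 83586.53155169818, 706582.0]

# Limits from the posted config: min_seq_len=90, max_seq_len=270.
kept = filter_by_length(logged_lengths, min_len=90, max_len=270)
print(len(kept))  # 0: every logged length is above max_len=270

# Assumption for illustration only: a limit above the longest logged length keeps all of them.
kept_wide = filter_by_length(logged_lengths, min_len=90, max_len=1_000_000)
print(len(kept_wide))  # 3
```

If this is what happens, the list handed to WeightedRandomSampler in the traceback would be empty, which matches num_samples=0. Whether raising max_seq_len is the right fix is not confirmed anywhere in this thread.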

To Reproduce

CUDA_VISIBLE_DEVICES="0" python TTS/bin/train_tts.py --restore_path ./datasets/model_file.pth.tar --config_path ./datasets/config.json

In config.json:

 "model": "vits",
    "run_name": "vits_tts-quadruplo",
    "run_description": "",
    "epochs": 1000,
    "batch_size": 52,
    "eval_batch_size": 52,
    "mixed_precision": false,
    "scheduler_after_epoch": true,
    "run_eval": true,
    "test_delay_epochs": -1,
    "print_eval": true,
    "dashboard_logger": "tensorboard",
    "print_step": 25,
    "plot_step": 100,
    "model_param_stats": false,
    "project_name": null,
    "log_model_step": 10000,
    "wandb_entity": null,
    "save_step": 10000,
    "checkpoint": true,
    "keep_all_best": false,
    "keep_after": 10000,
    "num_loader_workers": 4,
    "num_eval_loader_workers": 4,
    "use_noise_augment": false,
    "use_language_weighted_sampler": true,
    "output_path": "../checkpoints/",
    "audio": {
        "fft_size": 1024,
        "win_length": 1024,
        "hop_length": 256,
        "frame_shift_ms": null,
        "frame_length_ms": null,
        "stft_pad_mode": "reflect",
        "sample_rate": 16000,
        "resample": false,
        "preemphasis": 0.0,
        "ref_level_db": 20,
        "do_sound_norm": false,
        "log_func": "np.log",
        "do_trim_silence": true,
        "trim_db": 45,
        "power": 1.5,
        "griffin_lim_iters": 60,
        "num_mels": 80,
        "mel_fmin": 0.0,
        "mel_fmax": null,
        "spec_gain": 1,
        "do_amp_to_db_linear": false,
        "do_amp_to_db_mel": true,
        "signal_norm": false,
        "min_level_db": -100,
        "symmetric_norm": true,
        "max_norm": 4.0,
        "clip_norm": true,
        "stats_path": null
    },
    "use_phonemes": false,
    "use_espeak_phonemes": false,
    "phoneme_language": "pt-br",
    "compute_input_seq_cache": false,
    "text_cleaner": "multilingual_cleaners",
    "enable_eos_bos_chars": false,
    "test_sentences_file": "",
    "phoneme_cache_path": null,
    "characters": {
        "pad": "_",
        "eos": "&",
        "bos": "*",
        "characters": "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\u00af\u00b7\u00df\u00e0\u00e1\u00e2\u00e3\u00e4\u00e6\u00e7\u00e8\u00e9\u00ea\u00eb\u00ec\u00ed\u00ee\u00ef\u00f1\u00f2\u00f3\u00f4\u00f5\u00f6\u00f9\u00fa\u00fb\u00fc\u00ff\u0101\u0105\u0107\u0113\u0119\u011b\u012b\u0131\u0142\u0144\u014d\u0151\u0153\u015b\u016b\u0171\u017a\u017c\u01ce\u01d0\u01d2\u01d4\u0430\u0431\u0432\u0433\u0434\u0435\u0436\u0437\u0438\u0439\u043a\u043b\u043c\u043d\u043e\u043f\u0440\u0441\u0442\u0443\u0444\u0445\u0446\u0447\u0448\u0449\u044a\u044b\u044c\u044d\u044e\u044f\u0451\u0454\u0456\u0457\u0491\u2013!'(),-.:;? ",
        "punctuations": "!'(),-.:;? ",
        "phonemes": "iy\u0268\u0289\u026fu\u026a\u028f\u028ae\u00f8\u0258\u0259\u0275\u0264o\u025b\u0153\u025c\u025e\u028c\u0254\u00e6\u0250a\u0276\u0251\u0252\u1d7b\u0298\u0253\u01c0\u0257\u01c3\u0284\u01c2\u0260\u01c1\u029bpbtd\u0288\u0256c\u025fk\u0261q\u0262\u0294\u0274\u014b\u0272\u0273n\u0271m\u0299r\u0280\u2c71\u027e\u027d\u0278\u03b2fv\u03b8\u00f0sz\u0283\u0292\u0282\u0290\u00e7\u029dx\u0263\u03c7\u0281\u0127\u0295h\u0266\u026c\u026e\u028b\u0279\u027bj\u0270l\u026d\u028e\u029f\u02c8\u02cc\u02d0\u02d1\u028dw\u0265\u029c\u02a2\u02a1\u0255\u0291\u027a\u0267\u025a\u02de\u026b'\u0303' ",
        "unique": true
    },
    "batch_group_size": 0,
    "loss_masking": null,
    "min_seq_len": 90,
    "max_seq_len": 270,
    "compute_f0": false,
    "compute_linear_spec": true,
    "add_blank": true,
    "datasets": [
        {
            "name": "vctk",
            "path": "./datasets/VCTK/",
            "meta_file_train": null,
            "ununsed_speakers": [
                "p225",
                "p234",
                "p238",
                "p245",
                "p248",
                "p261",
                "p294",
                "p302",
                "p326",
                "p335",
                "p347"
            ],
            "language": "en",
            "meta_file_val": null,
            "meta_file_attn_mask": ""
        },
        {
            "name": "brspeech",
            "path": "./datasets/TTS-Portuguese-Corpus_16khz/",
            "meta_file_train": "train_TTS-Portuguese_Corpus_metadata.csv",
            "ununsed_speakers": null,
            "language": "pt-br",
            "meta_file_val": "eval_TTS-Portuguese_Corpus_metadata.csv",
            "meta_file_attn_mask": ""
        },
        {
            "name": "mailabs",
            "path": "./datasets/mailabs/fr_FR",
            "meta_file_train": null,
            "ununsed_speakers": null,
            "language": "fr-fr",
            "meta_file_val": null,
            "meta_file_attn_mask": null
        },
        {
            "name": "mailabs",
            "path": "./datasets/mailabs/it_IT",
            "meta_file_train": null,
            "ununsed_speakers": null,
            "language": "it-it",
            "meta_file_val": null,
            "meta_file_attn_mask": null
        }
    ],
    "optimizer": "AdamW",
    "optimizer_params": {
        "betas": [
            0.8,
            0.99
        ],
        "eps": 1e-09,
        "weight_decay": 0.01
    },
    "lr_scheduler": "",
    "lr_scheduler_params": null,
    "test_sentences": [
    ],
    "use_speaker_embedding": false,
    "use_d_vector_file": true,
    "d_vector_dim": 512,
    
    "language_ids":   {
            "en": 0,
            "fr-fr": 1,
            "pt-br": 2,
            "it-it": 3

    },

    "model_args": {
        "num_chars": 165,
        "out_channels": 513,
        "spec_segment_size": 62,
        "hidden_channels": 192,
        "hidden_channels_ffn_text_encoder": 768,
        "num_heads_text_encoder": 2,
        "num_layers_text_encoder": 10,
        "kernel_size_text_encoder": 3,
        "dropout_p_text_encoder": 0.1,
        "dropout_p_duration_predictor": 0.5,
        "kernel_size_posterior_encoder": 5,
        "dilation_rate_posterior_encoder": 1,
        "num_layers_posterior_encoder": 16,
        "kernel_size_flow": 5,
        "dilation_rate_flow": 1,
        "num_layers_flow": 4,
        "resblock_type_decoder": "2",
        "resblock_kernel_sizes_decoder": [
            3,
            7,
            11
        ],
        "resblock_dilation_sizes_decoder": [
            [
                1,
                3,
                5
            ],
            [
                1,
                3,
                5
            ],
            [
                1,
                3,
                5
            ]
        ],
        "upsample_rates_decoder": [
            8,
            8,
            2,
            2
        ],
        "upsample_initial_channel_decoder": 512,
        "upsample_kernel_sizes_decoder": [
            16,
            16,
            4,
            4
        ],
        "use_sdp": true,
        "noise_scale": 1.0,
        "inference_noise_scale": 0.3,
        "length_scale": 1.5,
        "noise_scale_dp": 0.6,
        "inference_noise_scale_dp": 0.3,
        "max_inference_len": null,
        "init_discriminator": true,
        "use_spectral_norm_disriminator": false,
        "use_speaker_embedding": false,
        "num_speakers": 1244,
        "speakers_file": null,
        "d_vector_file": "./datasets/alt_d_vector_file.json",
        "speaker_embedding_channels": 512,
        "use_d_vector_file": true,
        "d_vector_dim": 512,
        "detach_dp_input": true,
        "use_language_embedding": true,
        "embedded_language_dim": 4,
        "num_languages": 4,
        "use_speaker_encoder_as_loss": true,
        "speaker_encoder_config_path": "./datasets/config_se.json",
        "speaker_encoder_model_path": "./datasets/model_se.pth.tar",
        "fine_tuning_mode": 0,
        "freeze_encoder": false,
        "freeze_DP": false,
        "freeze_PE": false,
        "freeze_flow_decoder": false,
        "freeze_waveform_decoder": false
    },
    "grad_clip": [
        5.0,
        5.0
    ],
    "lr_gen": 0.0002,
    "lr_disc": 0.0002,
    "lr_scheduler_gen": "ExponentialLR",
    "lr_scheduler_gen_params": {
        "gamma": 0.999875,
        "last_epoch": -1
    },
    "lr_scheduler_disc": "ExponentialLR",
    "lr_scheduler_disc_params": {
        "gamma": 0.999875,
        "last_epoch": -1
    },
    "kl_loss_alpha": 1.0,
    "disc_loss_alpha": 1.0,
    "gen_loss_alpha": 1.0,
    "feat_loss_alpha": 1.0,
    "mel_loss_alpha": 45.0,
    "dur_loss_alpha": 1.0,
    "speaker_encoder_loss_alpha": 9.0,
    "return_wav": true,
    "r": 1

(I removed the test sentences, the rest is unchanged)
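
Since the config adds a fourth language, one thing that can be checked mechanically is whether the language entries agree with each other. The sketch below assumes the config has been loaded into a plain dict and that the relevant keys are the ones shown above (`datasets[].language`, `language_ids`, `model_args.num_languages`); `check_language_config` is a hypothetical helper, not part of the TTS library:

```python
from typing import Any, Dict, List


def check_language_config(config: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the entries agree."""
    problems: List[str] = []
    language_ids = config.get("language_ids", {})

    # Every dataset language should have an entry in language_ids.
    for ds in config.get("datasets", []):
        lang = ds.get("language")
        if lang not in language_ids:
            problems.append(f"dataset {ds.get('name')!r} uses unmapped language {lang!r}")

    # model_args.num_languages should match the number of mapped languages.
    num_languages = config.get("model_args", {}).get("num_languages")
    if num_languages != len(language_ids):
        problems.append(
            f"num_languages={num_languages} but language_ids has {len(language_ids)} entries"
        )
    return problems


# Excerpt of the posted config, reduced to the keys the check reads.
posted = {
    "datasets": [
        {"name": "vctk", "language": "en"},
        {"name": "brspeech", "language": "pt-br"},
        {"name": "mailabs", "language": "fr-fr"},
        {"name": "mailabs", "language": "it-it"},
    ],
    "language_ids": {"en": 0, "fr-fr": 1, "pt-br": 2, "it-it": 3},
    "model_args": {"num_languages": 4},
}

problems = check_language_config(posted)
print(problems)  # [] for the posted values
```

For the values posted here the check reports nothing, so by this measure the language entries are consistent and the crash is more likely to come from elsewhere.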

Expected behavior

Training starts normally

Environment

  • 🐸TTS Version (e.g., 1.3.0): 0.5.0 (just reinstalled)
  • PyTorch Version (e.g., 1.8): 1.9.1+cu111
  • Python version: 3.8.5
  • OS (e.g., Linux): Ubuntu
  • CUDA/cuDNN version: 11.1
  • GPU models and configuration: NVIDIA V100
  • How you installed PyTorch (conda, pip, source): pip
  • Any other relevant information:

Additional context

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 12

Top GitHub Comments

1 reaction
e0xextazy commented, Mar 28, 2022

Hi! There are probably a lot of other people here that can offer better guidance, but I can share my experience. My config looks mostly the same (different batch_size, but should not matter). Maybe you should start from a different checkpoint? I did start from checkpoint 3 from the YourTTS repo.

Try setting "num_languages": 1 and changing "language_ids".

1 reaction
fColangelo commented, Mar 28, 2022

Hi! There are probably a lot of other people here that can offer better guidance, but I can share my experience. My config looks mostly the same (different batch_size, but should not matter). Maybe you should start from a different checkpoint? I did start from checkpoint 3 from the YourTTS repo.
