ljspeech fast_speech recipe for en_US Fails With assert len(out_dict["token_ids"]) > 0
Describe the bug
I am training fast_speech on the ljspeech en_US set using this recipe (on dev branch).
The data is merged into a single LJSpeech-style set using these commands, as described here:

```sh
cat ../by_book/*/*/*/metadata.csv >> metadata.csv
mkdir wavs
cp ../by_book/*/*/*/wavs/* wavs/
```
I am getting this exception:
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/trainer/trainer.py", line 1492, in fit
    self._fit()
  File "/usr/local/lib/python3.7/dist-packages/trainer/trainer.py", line 1476, in _fit
    self.train_epoch()
  File "/usr/local/lib/python3.7/dist-packages/trainer/trainer.py", line 1254, in train_epoch
    for cur_step, batch in enumerate(self.train_loader):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 530, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1204, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.7/dist-packages/torch/_utils.py", line 457, in reraise
    raise exception
AssertionError: Caught AssertionError in DataLoader worker process 4.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/apps/tts/TTS/TTS/tts/datasets/dataset.py", line 180, in __getitem__
    return self.load_data(idx)
  File "/apps/tts/TTS/TTS/tts/datasets/dataset.py", line 230, in load_data
    token_ids = self.get_token_ids(idx, item["text"])
  File "/apps/tts/TTS/TTS/tts/datasets/dataset.py", line 213, in get_token_ids
    token_ids = self.get_phonemes(idx, text)["token_ids"]
  File "/apps/tts/TTS/TTS/tts/datasets/dataset.py", line 198, in get_phonemes
    assert len(out_dict["token_ids"]) > 0
AssertionError
```
To Reproduce
Get the data:
```sh
cd /data/
wget https://data.solak.de/data/Training/stt_tts/en_US.tgz
tar zxf en_US.tgz
rm en_US.tgz
cd en_US/
mkdir ljspeech
cd ljspeech/
cat ../by_book/*/*/*/metadata.csv >> metadata.csv
mkdir wavs
cp ../by_book/*/*/*/wavs/* wavs/
```
(If the `cp` fails because the argument list is too long, run it per speaker, and per book if necessary: `cp ../by_book/female/*/*/wavs/* wavs/`)
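Alternatively, the "argument list too long" failure can be avoided entirely by letting `find` batch the file list instead of expanding one huge glob. A sketch, assuming the same `by_book` layout and GNU coreutils (`cp -t` is a GNU extension):

```shell
# Copy every wav under by_book into wavs/ without one giant glob expansion.
# find passes the paths to cp in batches, so ARG_MAX is never exceeded.
find ../by_book -type f -name '*.wav' -exec cp -t wavs/ {} +
```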
Then check with a Python script that every file referenced in the metadata exists:

```python
import os

with open("metadata.csv") as f:
    for line in f:
        file_name = line.split("|")[0]
        if not os.path.exists("wavs/" + file_name + ".wav"):
            print(file_name)
```
Manually remove the non-existent entries from metadata.csv (there are three missing files at the time of writing).
Point `train_fast_speech.py` at the data (`path="/data/en_US/ljspeech/"`) and change the sample rate to match the 16 kHz recordings (`sample_rate=16000`).
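In the recipe, the sample rate lives in the audio config. A hypothetical sketch of the fragment to change (import path and field names follow the TTS 0.6.x recipe layout as I understand it; verify against your checkout of `train_fast_speech.py`):

```python
# Hypothetical fragment of train_fast_speech.py -- verify the import path
# and field names against your own checkout of the recipe.
from TTS.config.shared_configs import BaseAudioConfig

# The en_US recordings here are 16 kHz, unlike LJSpeech's 22.05 kHz,
# so the audio config must be told the true rate.
audio_config = BaseAudioConfig(sample_rate=16000)
```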
Run the code:

```sh
cd TTS
export CUDA_VISIBLE_DEVICES=4
python3 recipes/ljspeech/fast_speech/train_fast_speech.py
```
After the "Pre-computing phonemes" step completes, training fails at the first step (`--> STEP: 0/1433 -- GLOBAL_STEP: 0`):
The same AssertionError traceback as in the description above is raised.
Expected behavior
Training proceeds without the assertion error.
Logs
(Same AssertionError traceback as shown in the description.)
Environment
```json
{
    "CUDA": {
        "GPU": [
            "NVIDIA A100-SXM4-40GB"
        ],
        "available": true,
        "version": "11.5"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "1.11.0+cu115",
        "TTS": "0.6.2",
        "numpy": "1.21.6"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "x86_64",
        "python": "3.7.13",
        "version": "#61~18.04.3-Ubuntu SMP Fri Oct 1 14:04:01 UTC 2021"
    }
}
```
Additional context
No response
Issue Analytics
- Created: a year ago
- Comments: 9 (7 by maintainers)
Top GitHub Comments

@p0p4k thank you for the pointer! I added this code to the `__getitem__` method per your suggestion and found the item causing the issue: after the tokenization step, its text was just whitespace. After removing it manually from `metadata.csv`, it is working again.

Check if the phonemes cache folder actually exists first. Then, in the `dataset.py` file, you can try to debug by printing out the ids in the `__getitem__` method or the `compute_or_load` method of the `PhonemeDataset` class, and report whether the ids are printed. You can make the debugging quicker by using a temporary metadata file with just a few `wav|text` lines (~5-10 lines). Try it and report back, good luck.
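The root cause found above (a transcript that tokenizes to nothing) can be screened for without running the trainer at all. A minimal standalone sketch that flags metadata rows whose text column is empty, whitespace-only, or punctuation-only; note it will not catch every row a phonemizer could still reduce to zero tokens:

```python
import string

def flag_empty_rows(metadata_path="metadata.csv"):
    """Return wav ids whose transcript has no characters left after
    stripping punctuation and whitespace -- rows likely to yield
    len(token_ids) == 0 during phoneme tokenization."""
    bad = []
    with open(metadata_path, encoding="utf-8") as f:
        for line in f:
            cols = line.rstrip("\n").split("|")
            # LJSpeech-style format: wav_id|raw_text|normalized_text
            text = cols[-1] if len(cols) > 1 else ""
            stripped = text.translate(
                str.maketrans("", "", string.punctuation)).strip()
            if not stripped:
                bad.append(cols[0])
    return bad

if __name__ == "__main__":
    print(flag_empty_rows())
```

Rows it reports can then be deleted from `metadata.csv` before launching training, avoiding the mid-epoch DataLoader crash.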