
[Bug] ValueError: Cannot load file containing pickled data when allow_pickle=False

See original GitHub issue

Describe the bug

I had been training Tacotron 2 for a while, and now I want to add sample audio for one speaker. When I run:

CUDA_VISIBLE_DEVICES=0 python train.py --continue_path /media/DATA-2/TTS/TTS_Coqui/TTS/running-July-28-2022_09+54AM-68cef28a

I got an error like this:

 > Number of output frames: 2

 > EPOCH: 0/1000
 --> /media/DATA-2/TTS/TTS_Coqui/TTS-July-28-2022_09+54AM-68cef28a


> DataLoader initialization
| > Tokenizer:
	| > add_blank: False
	| > use_eos_bos: False
	| > use_phonemes: True
	| > phonemizer:
		| > phoneme language: en-us
		| > phoneme backend: gruut
| > Number of instances : 23359
 | > Preprocessing samples
 | > Max text length: 239
 | > Min text length: 4
 | > Avg text length: 86.08806027655294
 | 
 | > Max audio length: 1145718.0
 | > Min audio length: 11868.0
 | > Avg audio length: 519904.13767712656
 | > Num. instances discarded samples: 0
 | > Batch group size: 0.

 > TRAINING (2022-09-01 11:28:31) 
/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/models/tacotron2.py:333: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  ) // self.decoder.r
/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  ../aten/src/ATen/native/TensorShape.cpp:2228.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/models/tacotron2.py:335: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  alignment_lengths = mel_lengths // self.decoder.r

   --> STEP: 9/5840 -- GLOBAL_STEP: 1690010
     | > decoder_loss: 1.35190  (2.06165)
     | > postnet_loss: 1.23185  (1.89519)
     | > stopnet_loss: 0.45206  (0.54466)
     | > decoder_coarse_loss: 1.96557  (2.80050)
     | > decoder_ddc_loss: 0.05431  (0.06398)
     | > ga_loss: 0.00554  (0.01036)
     | > decoder_diff_spec_loss: 0.46238  (0.58947)
     | > postnet_diff_spec_loss: 0.40906  (0.52605)
     | > decoder_ssim_loss: 0.48877  (0.48201)
     | > postnet_ssim_loss: 0.45778  (0.45322)
     | > loss: 2.08516  (2.81450)
     | > align_error: 0.38218  (0.36455)
     | > grad_norm: 11.03733  (13.36171)
     | > current_lr: 0.00000 
     | > step_time: 0.16360  (0.17053)
     | > loader_time: 0.00130  (0.00129)


   --> STEP: 19/5840 -- GLOBAL_STEP: 1690020
     | > decoder_loss: 1.26435  (2.00329)
     | > postnet_loss: 1.14596  (1.83944)
     | > stopnet_loss: 0.15051  (0.49044)
     | > decoder_coarse_loss: 1.96471  (2.79364)
     | > decoder_ddc_loss: 0.03852  (0.05443)
     | > ga_loss: 0.00158  (0.00696)
     | > decoder_diff_spec_loss: 0.44740  (0.57787)
     | > postnet_diff_spec_loss: 0.39480  (0.51306)
     | > decoder_ssim_loss: 0.43631  (0.47875)
     | > postnet_ssim_loss: 0.40454  (0.44884)
     | > loss: 1.68256  (2.70255)
     | > align_error: 0.32000  (0.36616)
     | > grad_norm: 6.11971  (12.52853)
     | > current_lr: 0.00000 
     | > step_time: 0.22500  (0.19586)
     | > loader_time: 0.00150  (0.00125)

 ! Run is kept in /media/DATA-2/TTS/TTS_Coqui/TTS-July-28-2022_09+54AM-68cef28a
Traceback (most recent call last):
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/trainer/trainer.py", line 1492, in fit
    self._fit()
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/trainer/trainer.py", line 1476, in _fit
    self.train_epoch()
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/trainer/trainer.py", line 1254, in train_epoch
    for cur_step, batch in enumerate(self.train_loader):
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
    data = self._next_data()
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1224, in _next_data
    return self._process_data(data)
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
    data.reraise()
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/_utils.py", line 457, in reraise
    raise exception
ValueError: Caught ValueError in DataLoader worker process 1.
Original Traceback (most recent call last):
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 180, in __getitem__
    return self.load_data(idx)
  File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 230, in load_data
    token_ids = self.get_token_ids(idx, item["text"])
  File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 213, in get_token_ids
    token_ids = self.get_phonemes(idx, text)["token_ids"]
  File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 196, in get_phonemes
    out_dict = self.phoneme_dataset[idx]
  File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 563, in __getitem__
    ids = self.compute_or_load(item["audio_file"], item["text"])
  File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 579, in compute_or_load
    ids = np.load(cache_path)
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/numpy/lib/npyio.py", line 445, in load
    raise ValueError("Cannot load file containing pickled data "
ValueError: Cannot load file containing pickled data when allow_pickle=False
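
The error itself comes straight from NumPy: since version 1.16.3, np.load() defaults to allow_pickle=False and refuses any .npy file that stores pickled (object-array) data. A minimal, self-contained reproduction, not taken from the issue:

import numpy as np

# A dict-valued entry forces an object array, which np.save can only
# store by pickling it.
arr = np.array([{"token_ids": [1, 2, 3]}], dtype=object)
np.save("cache_example.npy", arr)

try:
    np.load("cache_example.npy")  # allow_pickle defaults to False
except ValueError as err:
    print(err)  # Cannot load file containing pickled data when allow_pickle=False

data = np.load("cache_example.npy", allow_pickle=True)  # loads fine
print(data[0]["token_ids"])  # [1, 2, 3]

In this case the cached phoneme files apparently contain such data; the fixes discussed in the comments below either regenerate the cache or opt back into pickle loading.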

Environment

{
  "CUDA": {
    "GPU": ["NVIDIA GeForce GTX 1660 Ti"],
    "available": true,
    "version": "10.2"
  },
  "Packages": {
    "PyTorch_debug": false,
    "PyTorch_version": "1.11.0+cu102",
    "TTS": "0.6.1",
    "numpy": "1.19.5"
  },
  "System": {
    "OS": "Linux",
    "architecture": ["64bit", "ELF"],
    "processor": "x86_64",
    "python": "3.8.0",
    "version": "#118~18.04.1-Ubuntu SMP Thu Mar 3 13:53:15 UTC 2022"
  }
}

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 6 (2 by maintainers)

Top GitHub Comments

1 reaction
kin0303 commented on Sep 9, 2022

I solved this problem. If you run into it, you can try one of the following:

  1. Move the cache folder temporarily to a different location and let it rebuild.
  2. Add allow_pickle=True to np.load(cache_path), i.e. np.load(cache_path, allow_pickle=True) at /media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py, line 579 (see the sketch after this list).
  3. Or read this issue: https://github.com/coqui-ai/TTS/issues/1624
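
For illustration, here is a minimal sketch of option 2; the wrapper function is hypothetical, but the np.load call matches the one at the failing line:

import numpy as np

def load_phoneme_cache(cache_path):
    # Hypothetical wrapper around the np.load call at dataset.py, line 579.
    # np.load() has defaulted to allow_pickle=False since NumPy 1.16.3, so a
    # cached .npy holding pickled (object-array) data raises the ValueError
    # above. Only enable pickling for cache files you generated yourself;
    # pickle is not safe on untrusted data.
    return np.load(cache_path, allow_pickle=True)

Rebuilding the cache (option 1) is arguably the safer route, since allow_pickle=True loads a stale or corrupted entry rather than regenerating it.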
0 reactions
kin0303 commented on Sep 2, 2022

@blackmamba1122 Looks like some of the cached phonemes are corrupted. You need to delete the cache directory or change the phoneme cache directory (the “phoneme_cache_path” parameter in the config) to force TTS to recompute it.
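
For reference, a minimal sketch of that cache reset; the cache location below is an assumption, so substitute whatever your config's “phoneme_cache_path” points to:

import shutil
from pathlib import Path

# Assumed location; read the real one from the "phoneme_cache_path" config entry.
cache_dir = Path("/media/DATA-2/TTS/TTS_Coqui/TTS/phoneme_cache")

if cache_dir.exists():
    # Move the cache aside rather than deleting it, so it can still be
    # inspected; training rebuilds it on the next run.
    shutil.move(str(cache_dir), str(cache_dir) + ".bak")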

I’ll try this one and report back.

Still getting an error:

   --> STEP: 1209/5840 -- GLOBAL_STEP: 1691210
     | > decoder_loss: 0.58440  (0.80216)
     | > postnet_loss: 0.51266  (0.72178)
     | > stopnet_loss: 0.84992  (0.29996)
     | > decoder_coarse_loss: 0.89247  (1.24166)
     | > decoder_ddc_loss: 0.00162  (0.00863)
     | > ga_loss: 0.00004  (0.00034)
     | > decoder_diff_spec_loss: 0.37415  (0.39893)
     | > postnet_diff_spec_loss: 0.33291  (0.35312)
     | > decoder_ssim_loss: 0.12760  (0.25786)
     | > postnet_ssim_loss: 0.11682  (0.23833)
     | > loss: 1.58579  (1.30728)
     | > align_error: 0.60102  (0.41433)
     | > grad_norm: 4.16512  (4.11554)
     | > current_lr: 0.00000 
     | > step_time: 2.44260  (1.16248)
     | > loader_time: 0.00260  (0.00190)

 ! Run is kept in /media/DATA-2/TTS/TTS_Coqui/TTS-July-28-2022_09+54AM-68cef28a
Traceback (most recent call last):
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/trainer/trainer.py", line 1492, in fit
    self._fit()
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/trainer/trainer.py", line 1476, in _fit
    self.train_epoch()
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/trainer/trainer.py", line 1254, in train_epoch
    for cur_step, batch in enumerate(self.train_loader):
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
    data = self._next_data()
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1204, in _next_data
    return self._process_data(data)
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
    data.reraise()
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/_utils.py", line 457, in reraise
    raise exception
AssertionError: Caught AssertionError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/media/DATA-2/TTS/TTS_Coqui/coqui_env/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 180, in __getitem__
    return self.load_data(idx)
  File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 230, in load_data
    token_ids = self.get_token_ids(idx, item["text"])
  File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 213, in get_token_ids
    token_ids = self.get_phonemes(idx, text)["token_ids"]
  File "/media/DATA-2/TTS/TTS_Coqui/TTS/TTS/tts/datasets/dataset.py", line 198, in get_phonemes
    assert len(out_dict["token_ids"]) > 0
AssertionError
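
This assertion fires when a cached entry comes back with no token IDs, i.e. the phonemizer produced (or the cache stored) an empty sequence for some sample. As a hedged diagnostic, assuming each cache file holds the token-id array for one utterance (the layout is not confirmed by the issue), one could scan the cache for empty or unreadable entries:

import numpy as np
from pathlib import Path

# Assumed location; use the "phoneme_cache_path" value from your config.
cache_dir = Path("/media/DATA-2/TTS/TTS_Coqui/TTS/phoneme_cache")

for npy_file in sorted(cache_dir.glob("*.npy")):
    try:
        token_ids = np.load(npy_file, allow_pickle=True)
    except (ValueError, OSError) as err:
        print(f"unreadable cache entry: {npy_file} ({err})")
        continue
    if token_ids.size == 0:  # would trip the len(...) > 0 assertion above
        print(f"empty cache entry: {npy_file}")

Deleting whichever files this flags (or the whole cache directory) and letting training regenerate them should clear the assertion.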
