Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

TypeError: can't pickle Environment objects when num_workers > 0 for LSUN

See original GitHub issue

The program fails to create an iterator for a DataLoader object when the dataset used is LSUN and the number of workers is greater than zero. I do not get this error when working with other datasets. Something tells me that the issue might be caused by lmdb. I run on Windows 10, CUDA 10.

Code:

import torch.utils.data
import torchvision.datasets as dset
import torchvision.transforms as transforms

dataset = dset.LSUN(root='D:/bedroom_train_lmdb', classes=['bedroom_train'],
                            transform=transforms.Compose([
                                transforms.Resize((64, 64)),
                                transforms.ToTensor(),
                                transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
                            ]))

dataloader = torch.utils.data.DataLoader(dataset, batch_size=128,
                                             shuffle=True, num_workers=4)

for data in dataloader:
    print(data)

Error:

Traceback (most recent call last):
  File "C:/Users/x/.PyCharm2018.3/config/scratches/scratch.py", line 15, in <module>
    for data in dataloader:
  File "C:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 819, in __iter__
    return _DataLoaderIter(self)
  File "C:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 560, in __init__
    w.start()
  File "C:\Anaconda3\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Anaconda3\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Anaconda3\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Anaconda3\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Anaconda3\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle Environment objects

Issue Analytics

  • State: open
  • Created 5 years ago
  • Comments: 15 (3 by maintainers)

Top GitHub Comments

36 reactions
airsplay commented, Feb 28, 2021

A possible solution is similar to the one for HDF5:

  1. Do not open lmdb inside __init__
  2. Open the lmdb at the first data iteration.

Here is an illustration:

import lmdb
import torch.utils.data


class LMDBDataset(torch.utils.data.Dataset):
    def __init__(self, lmdb_dir):
        # Do NOT open the lmdb environment here!!
        # It would have to be pickled when the worker processes are spawned.
        self.lmdb_dir = lmdb_dir

    def open_lmdb(self):
        # Opened lazily, so each worker process gets its own handle.
        self.env = lmdb.open(self.lmdb_dir, readonly=True, create=False)
        self.txn = self.env.begin(buffers=True)

    def __getitem__(self, item: int):
        if not hasattr(self, 'txn'):
            self.open_lmdb()
        # Then do anything you want with env/txn here.

Explanation: The multiprocessing actually happens when you create the data iterator (e.g., when calling for datum in dataloader:): https://github.com/pytorch/pytorch/blob/461014d54b3981c8fa6617f90ff7b7df51ab1e85/torch/utils/data/dataloader.py#L712-L720

In short, the DataLoader creates multiple worker processes that “copy” the state of the current process. This copy involves pickling the LMDB Environment, which causes the error. In our solution, the lmdb file is opened at the first data iteration, so each subprocess ends up with its own dedicated handle.
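
For completeness, here is a minimal usage sketch (not from the original thread) showing the lazily-opening dataset used with num_workers > 0. It assumes the dataset class from the illustration above (called LMDBDataset here) also implements __len__ and returns tensors from __getitem__, both of which the illustration omits, and it reuses the hypothetical lmdb path from the bug report. On Windows, the loop must also sit under an if __name__ == '__main__': guard, since workers are started with the spawn method.

import torch.utils.data

if __name__ == '__main__':
    # LMDBDataset defined as in the illustration above;
    # hypothetical lmdb directory, as in the original report.
    dataset = LMDBDataset('D:/bedroom_train_lmdb')

    # num_workers > 0 now works: each worker opens its own Environment
    # on its first __getitem__ call, so nothing lmdb-related is pickled.
    dataloader = torch.utils.data.DataLoader(dataset, batch_size=128,
                                             shuffle=True, num_workers=4)

    for data in dataloader:
        print(data)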

14 reactions
Santiago810 commented, Feb 6, 2020

This issue also appears on Linux; the reason is that an opened lmdb environment cannot be pickled.
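
To see that root cause in isolation, the sketch below (not from the thread; it assumes an existing lmdb database at a hypothetical path ./some_lmdb_dir) tries to pickle an opened Environment directly, which fails with the same TypeError on any operating system:

import pickle

import lmdb

# Open any existing lmdb database (hypothetical path).
env = lmdb.open('./some_lmdb_dir', readonly=True, create=False)

try:
    # Spawning DataLoader workers pickles the dataset, including any
    # Environment it holds -- this reproduces that step directly.
    pickle.dumps(env)
except TypeError as e:
    print(e)  # e.g. "can't pickle Environment objects"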

Read more comments on GitHub >

Top Results From Across the Web

DataLoader Multiprocessing error: can't pickle odict_keys ...
TypeError: can't pickle Environment objects when num_workers > 0 for LSUN ... The program fails to create an iterator for a DataLoader...
Read more >
Python ERROR => h5py objects cannot be pickled
I am facing this error "h5py objects cannot be pickled" while running (train.py) on ... I am trying (num_workers=0), but...
Read more >
PyTorch Dataloader raises TypeError with LMDB data when num_worker>0 ...
Solving the problem where the PyTorch Dataloader raises TypeError: can't pickle Environment objects when using LMDB data and num_worker>0 ... Solution: ... class DataLoader(torch.
Read more >
A brand new website interface for an even better experience!
TypeError: can't pickle Environment objects when num_workers > 0 for LSUN.
Read more >
Error: TypeError: can't pickle Environment objects and EOFError ...
ForkingPickler(file, protocol).dump(obj) TypeError: can't pickle Environment objects ... Changing num_workers=0 solves it. num_workers (int, optional):
Read more >
