multiprocessing issues on Windows
Describe the bug
DataLoader calls fail on Windows when num_workers > 0. Windows only supports the spawn start method for multiprocessing, which pickles the dataset (including its transforms) to send it to each worker process, so any workflow involving unpicklable objects fails.
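A minimal sketch of the failure mode (the toy dataset and lambda below are illustrative, not from the tutorial): under spawn, the whole dataset object is pickled and sent to each worker, so a single unpicklable attribute breaks the loader, and in a notebook this can surface as the BrokenPipeError shown below.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):
    def __init__(self):
        # A lambda is not picklable, so this dataset cannot be
        # serialized and sent to spawned worker processes.
        self.transform = lambda x: x * 2

    def __len__(self):
        return 4

    def __getitem__(self, i):
        return self.transform(torch.tensor(float(i)))

if __name__ == "__main__":
    loader = DataLoader(ToyDataset(), num_workers=2)
    for batch in loader:  # fails under spawn (Windows); fine where fork is the default (e.g., Linux)
        print(batch)
```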
To Reproduce
Steps to reproduce the behavior:
- Run https://github.com/Project-MONAI/tutorials/blob/master/2d_classification/mednist_tutorial.ipynb on Windows
- Get the following error:
---------------------------------------------------------------------------
BrokenPipeError Traceback (most recent call last)
<ipython-input-11-390aa9a04062> in <module>
10 epoch_loss = 0
11 step = 0
---> 12 for batch_data in train_loader:
13 step += 1
14 inputs, labels = batch_data[0].to(device), batch_data[1].to(device)
c:\src\venv-3.7.9-prebuilt\lib\site-packages\torch\utils\data\dataloader.py in __iter__(self)
350 return self._iterator
351 else:
--> 352 return self._get_iterator()
353
354 @property
c:\src\venv-3.7.9-prebuilt\lib\site-packages\torch\utils\data\dataloader.py in _get_iterator(self)
292 return _SingleProcessDataLoaderIter(self)
293 else:
--> 294 return _MultiProcessingDataLoaderIter(self)
295
296 @property
c:\src\venv-3.7.9-prebuilt\lib\site-packages\torch\utils\data\dataloader.py in __init__(self, loader)
799 # before it starts, and __del__ tries to join but will get:
800 # AssertionError: can only join a started process.
--> 801 w.start()
802 self._index_queues.append(index_queue)
803 self._workers.append(w)
~\.pyenv\pyenv-win\versions\3.7.9\lib\multiprocessing\process.py in start(self)
110 'daemonic processes are not allowed to have children'
111 _cleanup()
--> 112 self._popen = self._Popen(self)
113 self._sentinel = self._popen.sentinel
114 # Avoid a refcycle if the target function holds an indirect
~\.pyenv\pyenv-win\versions\3.7.9\lib\multiprocessing\context.py in _Popen(process_obj)
221 @staticmethod
222 def _Popen(process_obj):
--> 223 return _default_context.get_context().Process._Popen(process_obj)
224
225 class DefaultContext(BaseContext):
~\.pyenv\pyenv-win\versions\3.7.9\lib\multiprocessing\context.py in _Popen(process_obj)
320 def _Popen(process_obj):
321 from .popen_spawn_win32 import Popen
--> 322 return Popen(process_obj)
323
324 class SpawnContext(BaseContext):
~\.pyenv\pyenv-win\versions\3.7.9\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
87 try:
88 reduction.dump(prep_data, to_child)
---> 89 reduction.dump(process_obj, to_child)
90 finally:
91 set_spawning_popen(None)
~\.pyenv\pyenv-win\versions\3.7.9\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
58 def dump(obj, file, protocol=None):
59 '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60 ForkingPickler(file, protocol).dump(obj)
61
62 #
BrokenPipeError: [Errno 32] Broken pipe
Expected behavior
num_workers > 0 should work on Windows.
Environment (e.g. using sh runtests.sh -v):
MONAI version: 0.3.0
Python version: 3.7.9 (tags/v3.7.9:13c94747c7, Aug 17 2020, 18:58:18) [MSC v.1900 64 bit (AMD64)]
OS version: Windows (10)
Numpy version: 1.19.3
Pytorch version: 1.7.0
MONAI flags: HAS_EXT = False, USE_COMPILED = False
Optional dependencies:
- Pytorch Ignite version: NOT INSTALLED or UNKNOWN VERSION
- Nibabel version: 3.2.0
- scikit-image version: 0.17.2
- Pillow version: 8.0.1
- Tensorboard version: 2.4.0
- gdown version: 3.12.2
- TorchVision version: 0.8.1
- ITK version: 5.1.1
- tqdm version: 4.51.0
Additional context
I’m not an expert on Python multiprocessing, but it looks like possible workarounds would be using the pathos.multiprocessing package, using a process Manager from multiprocessing, or somehow ensuring that all arguments passed to workers are picklable (e.g., no lambda functions); a sketch of that last idea follows.
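A minimal sketch of that suggestion, with illustrative names (scale, Scale, and find_unpicklable are not MONAI or PyTorch APIs): replace lambdas with module-level functions or callable classes, and use pickle.dumps to find which objects would break the workers.

```python
import pickle

def scale(x):
    # Module-level function: picklable by reference.
    return x * 2

class Scale:
    # Callable class defined at module level: instances pickle fine.
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        return x * self.factor

def find_unpicklable(objs):
    """Report which of the given objects cannot be pickled."""
    for obj in objs:
        try:
            pickle.dumps(obj)
        except Exception as exc:
            print(f"{type(obj).__name__}: not picklable ({exc})")

find_unpicklable([scale, Scale(2), lambda x: x * 2])  # only the lambda fails
```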
Top GitHub Comments
This alternative (Fast)DataLoader completely fixes the slowness issue for me on Windows: pytorch/pytorch#15849 (comment).

PyTorch itself doesn’t use transforms; that pattern comes from torchvision, so the issue wouldn’t necessarily be discussed on the PyTorch tracker. The problem is that the transform objects are not picklable, so they can’t be sent to the subprocesses. If you use PyTorch’s dataloaders natively, without augmentations, multiple workers can work just fine, since PyTorch can ensure the objects involved are picklable. One issue MONAI has is that its random state objects are not picklable, but I’m sure there are other issues that would need to be resolved if possible.
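For reference, the pattern from that linked comment, as best I recall it (check the original comment for the authoritative version), keeps the worker processes alive across epochs instead of respawning them each epoch, which is the expensive part on Windows:

```python
import torch.utils.data

class _RepeatSampler:
    """Wraps a sampler and repeats it forever, so the worker
    processes are never torn down between epochs."""
    def __init__(self, sampler):
        self.sampler = sampler

    def __iter__(self):
        while True:
            yield from iter(self.sampler)

class FastDataLoader(torch.utils.data.DataLoader):
    """DataLoader that creates its worker pool once and reuses it."""
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # DataLoader forbids reassigning batch_sampler after init,
        # so bypass its __setattr__ guard.
        object.__setattr__(self, "batch_sampler",
                           _RepeatSampler(self.batch_sampler))
        self.iterator = super().__iter__()

    def __len__(self):
        # Number of batches in one pass over the underlying sampler.
        return len(self.batch_sampler.sampler)

    def __iter__(self):
        for _ in range(len(self)):
            yield next(self.iterator)
```

Note that this only addresses the per-epoch spawn overhead; on Windows the dataset and its transforms still have to be picklable for the workers to start at all.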