
multiprocessing issues on Windows


Describe the bug

DataLoader calls fail on Windows when num_workers > 0. Windows causes certain multiprocessing workflows to fail due to pickling issues.
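
For context, a minimal sketch (not from the original report) of the underlying mechanism: on Windows, multiprocessing starts workers with the spawn method, so everything handed to a DataLoader worker, including the dataset and its transforms, must survive pickling. A lambda transform, for example, does not:

import pickle

# On Windows, DataLoader workers are started via the "spawn" method, so the
# Dataset object (and any transforms it holds) is pickled and sent to each
# worker process.  Anything pickle cannot handle breaks worker startup.
fn = lambda x: x * 2       # lambdas are a typical non-picklable culprit
pickle.dumps(fn)           # raises pickle.PicklingError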

To Reproduce
Steps to reproduce the behavior:

  1. Run https://github.com/Project-MONAI/tutorials/blob/master/2d_classification/mednist_tutorial.ipynb on Windows
  2. Get the following error:
---------------------------------------------------------------------------
BrokenPipeError                           Traceback (most recent call last)
<ipython-input-11-390aa9a04062> in <module>
     10     epoch_loss = 0
     11     step = 0
---> 12     for batch_data in train_loader:
     13         step += 1
     14         inputs, labels = batch_data[0].to(device), batch_data[1].to(device)

c:\src\venv-3.7.9-prebuilt\lib\site-packages\torch\utils\data\dataloader.py in __iter__(self)
    350             return self._iterator
    351         else:
--> 352             return self._get_iterator()
    353 
    354     @property

c:\src\venv-3.7.9-prebuilt\lib\site-packages\torch\utils\data\dataloader.py in _get_iterator(self)
    292             return _SingleProcessDataLoaderIter(self)
    293         else:
--> 294             return _MultiProcessingDataLoaderIter(self)
    295 
    296     @property

c:\src\venv-3.7.9-prebuilt\lib\site-packages\torch\utils\data\dataloader.py in __init__(self, loader)
    799             #     before it starts, and __del__ tries to join but will get:
    800             #     AssertionError: can only join a started process.
--> 801             w.start()
    802             self._index_queues.append(index_queue)
    803             self._workers.append(w)

~\.pyenv\pyenv-win\versions\3.7.9\lib\multiprocessing\process.py in start(self)
    110                'daemonic processes are not allowed to have children'
    111         _cleanup()
--> 112         self._popen = self._Popen(self)
    113         self._sentinel = self._popen.sentinel
    114         # Avoid a refcycle if the target function holds an indirect

~\.pyenv\pyenv-win\versions\3.7.9\lib\multiprocessing\context.py in _Popen(process_obj)
    221     @staticmethod
    222     def _Popen(process_obj):
--> 223         return _default_context.get_context().Process._Popen(process_obj)
    224 
    225 class DefaultContext(BaseContext):

~\.pyenv\pyenv-win\versions\3.7.9\lib\multiprocessing\context.py in _Popen(process_obj)
    320         def _Popen(process_obj):
    321             from .popen_spawn_win32 import Popen
--> 322             return Popen(process_obj)
    323 
    324     class SpawnContext(BaseContext):

~\.pyenv\pyenv-win\versions\3.7.9\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
     87             try:
     88                 reduction.dump(prep_data, to_child)
---> 89                 reduction.dump(process_obj, to_child)
     90             finally:
     91                 set_spawning_popen(None)

~\.pyenv\pyenv-win\versions\3.7.9\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
     58 def dump(obj, file, protocol=None):
     59     '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60     ForkingPickler(file, protocol).dump(obj)
     61 
     62 #

BrokenPipeError: [Errno 32] Broken pipe

Expected behavior
num_workers > 0 should work on Windows.

Screenshots
If applicable, add screenshots to help explain your problem.

Environment (please complete the following information; e.g. using sh runtests.sh -v):

  • MONAI version: 0.3.0
  • Python version: 3.7.9 (tags/v3.7.9:13c94747c7, Aug 17 2020, 18:58:18) [MSC v.1900 64 bit (AMD64)]
  • OS version: Windows (10)
  • Numpy version: 1.19.3
  • Pytorch version: 1.7.0
  • MONAI flags: HAS_EXT = False, USE_COMPILED = False

Optional dependencies:

  • Pytorch Ignite version: NOT INSTALLED or UNKNOWN VERSION.
  • Nibabel version: 3.2.0
  • scikit-image version: 0.17.2
  • Pillow version: 8.0.1
  • Tensorboard version: 2.4.0
  • gdown version: 3.12.2
  • TorchVision version: 0.8.1
  • ITK version: 5.1.1
  • tqdm version: 4.51.0

Additional context
I’m not an expert on Python multiprocessing, but possible fixes seem to include using the pathos.multiprocessing package, using a process Manager from multiprocessing, or ensuring that all arguments passed to workers are picklable (e.g., no lambda functions).
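
A hedged sketch of that last suggestion (illustrative names, not MONAI code): replacing a lambda with a top-level function via functools.partial, or with a small callable class, keeps everything handed to the workers picklable, and on Windows the loop also has to sit under an if __name__ == "__main__" guard because of the spawn start method:

from functools import partial

import torch
from torch.utils.data import Dataset, DataLoader

def scale(x, factor):              # top-level function: picklable
    return x * factor

class Scale:                       # small callable class: also picklable
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        return x * self.factor

class ToyDataset(Dataset):
    def __init__(self, transform):
        self.transform = transform

    def __len__(self):
        return 8

    def __getitem__(self, idx):
        return self.transform(torch.randn(3))

if __name__ == "__main__":         # required on Windows (spawn start method)
    for tf in (partial(scale, factor=2.0), Scale(2.0)):
        loader = DataLoader(ToyDataset(tf), batch_size=2, num_workers=2)
        for batch in loader:
            pass                   # both variants iterate without pickling errors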

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 5 (3 by maintainers)

Top GitHub Comments

1 reaction
aarpon commented, Nov 30, 2020

This alternative (Fast)DataLoader (completely) fixes the slowness issue for me on Windows: pytorch/pytorch#15849 (comment).
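
The linked FastDataLoader is not reproduced here, but a related point may be worth noting: much of the per-epoch cost on Windows comes from re-spawning the workers every epoch, and PyTorch 1.7+ exposes a persistent_workers flag on DataLoader that keeps the worker processes alive between epochs. A minimal sketch, assuming a picklable dataset:

import torch
from torch.utils.data import DataLoader, TensorDataset

if __name__ == "__main__":
    dataset = TensorDataset(torch.randn(64, 3), torch.randint(0, 2, (64,)))
    loader = DataLoader(
        dataset,
        batch_size=8,
        num_workers=2,
        persistent_workers=True,   # reuse workers across epochs (PyTorch >= 1.7)
    )
    for epoch in range(3):
        for inputs, labels in loader:
            pass                   # training step would go here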

0 reactions
ericspod commented, Nov 23, 2020

PyTorch itself doesn’t use transforms; that pattern comes from torchvision, so they wouldn’t necessarily discuss the issue there. The problem is that the transform objects are not picklable, so they can’t be sent to the subprocesses. If you use PyTorch’s dataloaders natively, without augmentations, they can work just fine with multiple processes, since the objects involved are picklable. One issue MONAI has is that the random state objects are not picklable, but I’m sure there are other issues that would need to be resolved if possible.
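
A quick way to check which pieces of a transform pipeline would block the workers is to try pickling each one directly. A generic sketch (the example transforms are placeholders, not MONAI objects):

import pickle

def report_unpicklable(transforms):
    # Try to pickle each transform, mimicking what the spawned
    # DataLoader workers require on Windows.
    for t in transforms:
        try:
            pickle.dumps(t)
        except Exception as exc:          # pickle raises several error types
            print(f"NOT picklable: {t!r} ({type(exc).__name__}: {exc})")
        else:
            print(f"picklable:     {t!r}")

if __name__ == "__main__":
    report_unpicklable([
        abs,                              # builtin function: picklable
        lambda x: x + 1,                  # lambda: not picklable
    ])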


