Correct way to send AsyncReplayBuffers to new processes
I’m trying to do multi-GPU DistributedDataParallel training using an `AsyncPrioritizedSequenceReplayFrameBuffer`, and I’m having trouble passing a buffer created in the parent process to the child processes. When I pass the buffer directly at initialization, I receive the error:
File "/home/schwarzm/anaconda3/envs/pytorch/lib/python3.7/multiprocessing/spawn.py", line 115, in _main
self = reduction.pickle.load(from_parent)
_pickle.UnpicklingError: NEWOBJ class argument isn't a type object
When I wrap the buffer in a cloudpickle wrapper before passing it, I instead see `TypeError: can't pickle mmap.mmap objects`.
From what I can tell, the function `launch_optimizer_workers` passes such a buffer directly to the created worker processes via `self.algo`. Is there some trick necessary to make this work? Does each worker need to create a new copy of the buffer at process startup? Launching workers with `fork` appears to avoid this issue, but that in turn makes it impossible to use CUDA in the child processes, since the CUDA runtime cannot be re-initialized in a forked subprocess.
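In case it helps the discussion, here is a generic pattern that I believe works under `spawn` (this is not rlpyt's own mechanism, and `attach` is a helper name I made up): allocate the storage as a `multiprocessing.RawArray`, which pickles as a shareable handle rather than as raw memory, and wrap it in a NumPy view in each process. Because the workers are spawned rather than forked, they remain free to use CUDA.

```python
import multiprocessing as mp
import numpy as np

def attach(raw, shape, dtype):
    # Wrap the shared ctypes block in a NumPy view; no copy is made,
    # so the parent and every child see the same memory.
    return np.frombuffer(raw, dtype=dtype).reshape(shape)

def worker(rank, raw, shape, dtype):
    samples = attach(raw, shape, dtype)
    samples[rank] = rank + 1  # write to this worker's row; the parent sees it

if __name__ == "__main__":
    shape, dtype = (2, 4), np.float64
    # Unlike a raw mmap, a RawArray survives pickling under spawn.
    raw = mp.RawArray("d", int(np.prod(shape)))

    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=worker, args=(rank, raw, shape, dtype))
             for rank in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

    print(attach(raw, shape, dtype))  # row 0 filled with 1.0, row 1 with 2.0
```

Each process gets its own NumPy wrapper, but all wrappers alias the same physical memory, so nothing is copied at startup; whether the rlpyt buffer can be rebuilt this way inside each worker is exactly my question.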
Top GitHub Comments
Yeah, I'd like to. It might take some time because I want to let this extra environment idea land first 😄
Cool! It was really helpful.