ConnectionResetError with multiprocessing in A2C
I've modified train.py in A2C for CartPole in Gym, and I'm running into a ConnectionResetError while testing with two processes. I'm using Python 3.5, gym 0.9.2, and tensorflow-cpu 1.3.0 on Ubuntu 14.04.
Here is the relevant portion of my version of train.py:
ncpu = num_processes
config = tf.ConfigProto(allow_soft_placement=True, intra_op_parallelism_threads=ncpu, inter_op_parallelism_threads=ncpu)
tf.Session(config=config).__enter__()
set_global_seeds(seed)
def make_env(rank):
    env = gym.make(env_id)
    env.seed(seed + rank)
    if logger.get_dir():
        env = bench.Monitor(env, os.path.join(logger.get_dir(), 'train-{}.monitor.json'.format(rank)))
    return env
env = SubprocVecEnv([make_env(i) for i in range(ncpu)])
env = VecNormalize(env)
and here is the error I get when num_processes = 2:
Process Process-1:
Traceback (most recent call last):
  File "/home/katelyng/anaconda3/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/home/katelyng/anaconda3/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/katelyng/baselines/baselines/common/vec_env/subproc_vec_env.py", line 8, in worker
    env = env_fn_wrapper.x()
TypeError: 'Monitor' object is not callable
Process Process-2:
Traceback (most recent call last):
  File "/home/katelyng/anaconda3/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/home/katelyng/anaconda3/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/katelyng/baselines/baselines/common/vec_env/subproc_vec_env.py", line 8, in worker
    env = env_fn_wrapper.x()
TypeError: 'Monitor' object is not callable
Traceback (most recent call last):
  File "/home/katelyng/anaconda3/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/katelyng/anaconda3/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/katelyng/AdaptiveRL/rl-environments/examples/a2c_baselines/train.py", line 116, in <module>
    main()
  File "/home/katelyng/AdaptiveRL/rl-environments/examples/a2c_baselines/train.py", line 111, in main
    seed=args.seed,
  File "/home/katelyng/AdaptiveRL/rl-environments/examples/a2c_baselines/train.py", line 44, in train
    env = SubprocVecEnv([make_env(i) for i in range(ncpu)])
  File "/home/katelyng/baselines/baselines/common/vec_env/subproc_vec_env.py", line 49, in __init__
    observation_space, action_space = self.remotes[0].recv()
  File "/home/katelyng/anaconda3/lib/python3.5/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/home/katelyng/anaconda3/lib/python3.5/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/home/katelyng/anaconda3/lib/python3.5/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer
I would appreciate any help with debugging this.
Issue Analytics
- Created 5 years ago
- Comments: 6
Top GitHub Comments
Hi @kxgao, could you share more detail about the “base.make_env” implementation? I'm hitting the same issue. Thank you in advance.
Best~ Yukang
Hi. I have the same problem as you do. Would you mind posting your solution here? Thanks.
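For anyone landing here with the same traceback: the ConnectionResetError in the parent looks like a side effect. Both worker processes die first with TypeError: 'Monitor' object is not callable at env = env_fn_wrapper.x(), which suggests the worker expects a zero-argument function that builds the environment, while make_env(i) in the snippet above returns an already-built environment. Once the workers are gone, the parent's recv() on the pipe fails. Below is a minimal sketch of the factory pattern, assuming that reading of the traceback is right. FakeEnv and worker_build are hypothetical stand-ins (for gym.make and the worker's call, respectively) so the example runs without gym or baselines installed.

```python
class FakeEnv:
    """Hypothetical stand-in for a gym environment; not part of gym or baselines."""

    def __init__(self, env_id):
        self.env_id = env_id
        self.seed_value = None

    def seed(self, value):
        self.seed_value = value


def make_env(env_id, seed, rank):
    """Return a zero-argument function that builds the env, not the env itself."""
    def _thunk():
        env = FakeEnv(env_id)  # in train.py this would be gym.make(env_id)
        env.seed(seed + rank)
        return env
    return _thunk


def worker_build(env_fn):
    # Hypothetical helper mirroring `env = env_fn_wrapper.x()` from the
    # traceback: the worker calls whatever object it was handed.
    return env_fn()


# Correct usage: a list of callables, one per process.
env_fns = [make_env("CartPole-v0", seed=0, rank=i) for i in range(2)]
envs = [worker_build(fn) for fn in env_fns]

# Reproducing the failure: handing over an env instance instead of a callable.
try:
    worker_build(FakeEnv("CartPole-v0"))
except TypeError as exc:
    error_message = str(exc)
```

Applied to the train.py snippet in the question, that would mean wrapping the body of make_env(rank) in an inner function and returning that function, so the gym.make and bench.Monitor calls run inside each subprocess. The SubprocVecEnv([make_env(i) for i in range(ncpu)]) line could then stay as it is.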