ConnectionResetError with multiprocessing in A2C

I’ve modified train.py in A2C to run CartPole in Gym, and I’m running into a ConnectionResetError while testing with two processes. I’m using Python 3.5, gym 0.9.2, and tensorflow-cpu 1.3.0 on Ubuntu 14.04.

Here is the relevant portion of my version of train.py:

ncpu = num_processes
config = tf.ConfigProto(allow_soft_placement=True, intra_op_parallelism_threads=ncpu, inter_op_parallelism_threads=ncpu)
tf.Session(config=config).__enter__()
set_global_seeds(seed)

def make_env(rank):
    env = gym.make(env_id)
    env.seed(seed + rank)
    if logger.get_dir():
        env = bench.Monitor(env, os.path.join(logger.get_dir(), 'train-{}.monitor.json'.format(rank)))
    return env

env = SubprocVecEnv([make_env(i) for i in range(ncpu)])
env = VecNormalize(env)

and here is the error I get when num_processes = 2:

Process Process-1:
Traceback (most recent call last):
  File "/home/katelyng/anaconda3/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/home/katelyng/anaconda3/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/katelyng/baselines/baselines/common/vec_env/subproc_vec_env.py", line 8, in worker
    env = env_fn_wrapper.x()
TypeError: 'Monitor' object is not callable
Process Process-2:
Traceback (most recent call last):
  File "/home/katelyng/anaconda3/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/home/katelyng/anaconda3/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/katelyng/baselines/baselines/common/vec_env/subproc_vec_env.py", line 8, in worker
    env = env_fn_wrapper.x()
TypeError: 'Monitor' object is not callable
Traceback (most recent call last):
  File "/home/katelyng/anaconda3/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/katelyng/anaconda3/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/katelyng/AdaptiveRL/rl-environments/examples/a2c_baselines/train.py", line 116, in <module>
    main()
  File "/home/katelyng/AdaptiveRL/rl-environments/examples/a2c_baselines/train.py", line 111, in main
    seed=args.seed,
  File "/home/katelyng/AdaptiveRL/rl-environments/examples/a2c_baselines/train.py", line 44, in train
    env = SubprocVecEnv([make_env(i) for i in range(ncpu)])
  File "/home/katelyng/baselines/baselines/common/vec_env/subproc_vec_env.py", line 49, in __init__
    observation_space, action_space = self.remotes[0].recv()
  File "/home/katelyng/anaconda3/lib/python3.5/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/home/katelyng/anaconda3/lib/python3.5/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/home/katelyng/anaconda3/lib/python3.5/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer

I would appreciate any help with debugging this.
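
One possible reading of the traceback above: each worker fails at `env = env_fn_wrapper.x()` with `TypeError: 'Monitor' object is not callable`, which suggests that SubprocVecEnv calls every list entry to build its environment, while `make_env(i)` here returns an environment that is already built. Once both workers have died, the parent's `recv()` would then fail with the ConnectionResetError. A minimal sketch of the usual pattern, assuming that is the cause, is to have `make_env` return a zero-argument function instead. `FakeEnv` and `build_all` are hypothetical stand-ins so that the example runs without gym or baselines; they are not part of either library.

```python
class FakeEnv:
    """Hypothetical stand-in for a gym environment (illustration only)."""

    def __init__(self, seed):
        self.seed_value = seed


def make_env(rank, seed=0):
    """Return a zero-argument function that builds the environment when called."""

    def _thunk():
        # In the real script, gym.make(...), env.seed(...) and the Monitor
        # wrapper would go here, so they run only when the thunk is called.
        return FakeEnv(seed + rank)

    return _thunk


def build_all(env_fns):
    """Mimic the worker step from the traceback: call each entry to get an env."""
    return [fn() for fn in env_fns]


# Each list entry is a function, not an environment instance.
env_fns = [make_env(i) for i in range(2)]
envs = build_all(env_fns)
```

Applied to the script above, this would mean wrapping the body of `make_env` in an inner function and returning that function, so that the list passed to SubprocVecEnv contains callables rather than Monitor objects.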

Issue Analytics

  • State: closed
  • Created 5 years ago
  • Comments: 6

Top GitHub Comments

2 reactions
yukang2017 commented, Aug 8, 2018

Hi @kxgao, would you please tell me more about the “base.make_env” implementation? I’m running into the same issue. Thank you in advance.

Best~ Yukang

2 reactions
wetliu commented, Apr 13, 2018

Hi. I have the same problem as you do. Would you mind posting your solution here? Thanks.
