Pybullet SubprocVecEnv Multiprocessing leads to Broken Pipe Error
Dear All,
I am new to SB3. I have been able to run a few basic examples. Recently, I trained KukaDiverseObjectEnv successfully with the SAC algorithm. Now I want to run multiple environments using SubprocVecEnv, but it did not work for me. I want to share my experience here and hopefully someone will be able to help me fix this bug. I am running all of this code on Google Colab.
I am using a Gym wrapper to convert the observation images from channel-last (HxWxC) to channel-first (CxHxW) format and to normalize the pixels between 0 and 1. My wrapper is as follows:
import gym
import numpy as np

class NormalizeObsvnWrapper(gym.Wrapper):
    """
    :param env: (gym.Env) Gym environment that will be wrapped
    """
    def __init__(self, env):
        assert isinstance(env.observation_space, gym.spaces.Box), \
            "Valid for continuous observation spaces of type gym.spaces.Box"
        self._height = env.observation_space.shape[0]
        self._width = env.observation_space.shape[1]
        self._channels = env.observation_space.shape[2]
        # the wrapper returns float32 pixels in [0, 1], so declare the space accordingly
        env.observation_space = gym.spaces.Box(low=0.0, high=1.0,
                                               shape=(self._channels,
                                                      self._height,
                                                      self._width),
                                               dtype=np.float32)
        env.reward_range = (-np.inf, np.inf)
        # call the parent constructor so that we can access self.env
        super(NormalizeObsvnWrapper, self).__init__(env)

    def _modify_obsvn(self, obs):
        # HxWxC -> CxHxW, then scale pixels to [0, 1]
        new_obs = np.transpose(obs, (2, 0, 1))
        new_obs = np.asarray(new_obs, dtype=np.float32) / 255.0
        return new_obs

    def reset(self):
        """
        Convert images from HxWxC to CxHxW format and
        normalize the pixels between 0 and 1.0.
        """
        return self._modify_obsvn(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        new_obs = self._modify_obsvn(obs)
        info['channel_first'] = True
        info['normalize_pixel'] = True
        return new_obs, reward, done, info
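A quick sanity check of the wrapper (a minimal sketch, assuming the default 48x48 camera images of `KukaDiverseObjectEnv`):

from pybullet_envs.bullet.kuka_diverse_object_gym_env import KukaDiverseObjectEnv

env = NormalizeObsvnWrapper(KukaDiverseObjectEnv(maxSteps=20, isDiscrete=False,
                                                 renders=False, removeHeightHack=False))
obs = env.reset()
# expect channel-first float32 pixels in [0, 1], e.g. shape (3, 48, 48)
print(obs.shape, obs.dtype, obs.min(), obs.max())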
I also use a custom CNN (`CustomCNN`) to extract features from the input images.
import gym
import torch as th
from stable_baselines3.common.torch_layers import BaseFeaturesExtractor

class CustomCNN(BaseFeaturesExtractor):
    """
    :param observation_space: (gym.Space)
    :param features_dim: (int) number of features extracted. This corresponds to
        the number of units in the last layer.
    """
    def __init__(self, observation_space: gym.spaces.Box, features_dim: int = 256):
        super(CustomCNN, self).__init__(observation_space, features_dim)
        # We assume CxHxW images (channels-first format)
        n_input_channels = observation_space.shape[0]
        self.cnn = th.nn.Sequential(
            th.nn.Conv2d(n_input_channels, 16, kernel_size=2, stride=2, padding=0),
            th.nn.ReLU(),
            th.nn.BatchNorm2d(16),
            th.nn.Conv2d(16, 32, kernel_size=2, stride=2, padding=0),
            th.nn.ReLU(),
            th.nn.BatchNorm2d(32),
            th.nn.Conv2d(32, 32, kernel_size=2, stride=2, padding=0),
            th.nn.ReLU(),
            th.nn.BatchNorm2d(32),
            th.nn.Flatten()
        )
        # compute the flattened size by doing one forward pass
        with th.no_grad():
            n_flatten = self.cnn(
                th.as_tensor(observation_space.sample()[None]).float()
            ).shape[1]
        self.linear = th.nn.Sequential(
            th.nn.Linear(n_flatten, 128),
            th.nn.ReLU(),
            th.nn.Linear(128, 128),
            th.nn.ReLU(),
            th.nn.Linear(128, features_dim),
            th.nn.ReLU()
        )

    def forward(self, observations: th.Tensor) -> th.Tensor:
        return self.linear(self.cnn(observations))
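As a quick shape check (a minimal sketch, assuming 3x48x48 observations as produced by the wrapped Kuka environment; adjust the shape if yours differs):

import numpy as np

obs_space = gym.spaces.Box(low=0.0, high=1.0, shape=(3, 48, 48), dtype=np.float32)
extractor = CustomCNN(obs_space, features_dim=64)
dummy_obs = th.as_tensor(obs_space.sample()[None])
print(extractor(dummy_obs).shape)  # torch.Size([1, 64])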
Now I can train a single environment as follows:
import pybullet as p
from pybullet_envs.bullet.kuka_diverse_object_gym_env import KukaDiverseObjectEnv
from stable_baselines3 import SAC
from stable_baselines3.common.monitor import Monitor

p.connect(p.DIRECT)
# create the environment
env = NormalizeObsvnWrapper(KukaDiverseObjectEnv(maxSteps=20, isDiscrete=False,
                                                 renders=False, removeHeightHack=False))
env = Monitor(env, monitor_path)
# custom policy parameters
policy_kwargs = dict(
    features_extractor_class=CustomCNN,
    features_extractor_kwargs=dict(features_dim=64),
    net_arch=dict(qf=[128, 64, 32], pi=[128, 64, 64])
)
# create the RL model
model = SAC('CnnPolicy', env, buffer_size=100000, batch_size=256,
            policy_kwargs=policy_kwargs, tensorboard_log=tb_log_path)
# train the model: 50K time steps is adequate
%time model.learn(total_timesteps=50000, log_interval=4, tb_log_name='kuka_sac')
p.disconnect()
This is how the training looks on TensorBoard:
Now I want to run 2 environments simultaneously using `SubprocVecEnv`. First, I define the `make_env` function as follows:
import gym
from typing import Callable
from stable_baselines3.common.utils import set_random_seed

def make_env(env_id, rank: int, seed: int = 0) -> Callable:
    """
    Utility function for a multiprocessed env.

    :param env_id: (gym.Env or str) environment instance or registered environment id
    :param rank: (int) index of the subprocess
    :param seed: (int) initial seed for the RNG
    """
    def _init() -> gym.Env:
        if isinstance(env_id, gym.Env):
            env = env_id
        elif isinstance(env_id, str):
            env = gym.make(env_id)
        else:
            raise ValueError('Invalid environment id')
        env.seed(seed + rank)
        return env

    set_random_seed(seed)
    return _init
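For reference, a hypothetical call with a registered string id would look like the snippet below; a string id can be pickled and shipped to the worker processes, whereas a live environment instance (with its open pybullet client) generally cannot:

from stable_baselines3.common.vec_env import SubprocVecEnv

# 'Pendulum-v0' is only a stand-in registered id for illustration
env_fns = [make_env('Pendulum-v0', rank=i, seed=0) for i in range(2)]
vec_env = SubprocVecEnv(env_fns)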
My main training program is as follows:
import pybullet as p
import gym
import numpy as np
from datetime import datetime
from pybullet_envs.bullet.kuka_diverse_object_gym_env import KukaDiverseObjectEnv
from stable_baselines3 import SAC
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize, SubprocVecEnv
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.monitor import Monitor

p.connect(p.DIRECT)

# create an instance of the Gym environment
env_id = NormalizeObsvnWrapper(KukaDiverseObjectEnv(maxSteps=20, isDiscrete=False,
                                                    renders=False, removeHeightHack=False))
# I also tried it without the wrapper, without any success:
# env_id = KukaDiverseObjectEnv(maxSteps=20, isDiscrete=False, renders=False,
#                               removeHeightHack=False)

# create the vectorized environment
num_cpu = 2
env = SubprocVecEnv([make_env(env_id, i) for i in range(num_cpu)])
# env = DummyVecEnv([make_env(env_id, i) for i in range(num_cpu)])  # this does not work either

# monitor the vectorized environment
env = Monitor(env, monitor_path)

# custom policy parameters
policy_kwargs = dict(
    features_extractor_class=CustomCNN,
    features_extractor_kwargs=dict(features_dim=64),
    net_arch=dict(qf=[128, 64, 32], pi=[128, 64, 64])
)

# create the RL model
model = SAC('CnnPolicy', env, buffer_size=70000, batch_size=256,
            policy_kwargs=policy_kwargs, tensorboard_log=tb_log_path)

# train the model
begin = datetime.now()
model.learn(total_timesteps=50000, log_interval=4, tb_log_name='kuka_sac_mp')
end = datetime.now()
print('Training time: ', end - begin)
p.disconnect()
I get the following error on Google Colab:
BrokenPipeError Traceback (most recent call last)
<ipython-input-11-1a3dc3ee7cba> in <module>()
25 # create vectorized environment
26 num_cpu = 2
---> 27 env = SubprocVecEnv([make_env(env_id, i) for i in range(num_cpu)])
28 #env = DummyVecEnv([make_env(env_id, i) for i in range(num_cpu)])
29
5 frames
/usr/lib/python3.7/multiprocessing/popen_forkserver.py in _launch(self, process_obj)
52 self.finalizer = util.Finalize(self, os.close, (self.sentinel,))
53 with open(w, 'wb', closefd=True) as f:
---> 54 f.write(buf.getbuffer())
55 self.pid = forkserver.read_signed(self.sentinel)
56
BrokenPipeError: [Errno 32] Broken pipe
I will greatly appreciate any help in this regard.
Thanks, Swagat
Top GitHub Comments
Hi, I did some more investigation and have been able to fix these errors to some extent. I am sharing it again here for the benefit of readers. I am able to avoid the previous error by passing the `kuka` environment arguments as a `dict` variable. This allows me to make use of the `make_vec_env` function without any errors. Looking into the source code of this function helped. The code now appears something like this:
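A minimal sketch of such a call, assuming the `dict` in question is the environment keyword arguments passed via `env_kwargs` (the environment class, unlike a live instance, is picklable):

from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.vec_env import SubprocVecEnv
from pybullet_envs.bullet.kuka_diverse_object_gym_env import KukaDiverseObjectEnv

# pass the environment class plus its arguments as a dict,
# instead of a single live environment instance
env_kwargs = dict(maxSteps=20, isDiscrete=False, renders=False, removeHeightHack=False)
env = make_vec_env(KukaDiverseObjectEnv, n_envs=2, seed=0,
                   env_kwargs=env_kwargs,
                   wrapper_class=NormalizeObsvnWrapper,
                   vec_env_cls=SubprocVecEnv)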
I get the following error this time:
So, it turns out that the current `SAC` implementation does not support multiple environments. The code works if I use `DummyVecEnv` as follows:
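A minimal sketch of the working `SAC` variant, assuming the same `env_kwargs` and `policy_kwargs` as above:

from stable_baselines3 import SAC
from stable_baselines3.common.vec_env import DummyVecEnv

# SAC trains with a single (dummy-vectorized) environment
env = make_vec_env(KukaDiverseObjectEnv, n_envs=1, seed=0,
                   env_kwargs=env_kwargs,
                   wrapper_class=NormalizeObsvnWrapper,
                   vec_env_cls=DummyVecEnv)
model = SAC('CnnPolicy', env, buffer_size=70000, batch_size=256,
            policy_kwargs=policy_kwargs)
model.learn(total_timesteps=50000, log_interval=4)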
Apparently, the `PPO` implementation does support multi-processing. The following code seems to work:
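A minimal sketch of the multi-process `PPO` variant (note that `PPO` does not accept the SAC-specific `qf` entries in `net_arch`, so only the feature extractor is customized here):

from stable_baselines3 import PPO

# PPO collects experience from several environments in parallel subprocesses
env = make_vec_env(KukaDiverseObjectEnv, n_envs=2, seed=0,
                   env_kwargs=env_kwargs,
                   wrapper_class=NormalizeObsvnWrapper,
                   vec_env_cls=SubprocVecEnv)
policy_kwargs_ppo = dict(
    features_extractor_class=CustomCNN,
    features_extractor_kwargs=dict(features_dim=64),
)
model = PPO('CnnPolicy', env, policy_kwargs=policy_kwargs_ppo)
model.learn(total_timesteps=50000, log_interval=4)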
So, to conclude the discussion, `SAC` does not support multi-processing while `PPO` does. There is no problem with the library; I was simply not using the interface properly. Thank you for creating this great library. Also, in this post, I demonstrate how to make use of Gym wrappers and custom policy networks to work with the `KukaDiverseObject` environment.

Regards, Swagat
The branch is up to date with master, it is just that the current multi-env implementation was not made (but can be adapted) for dict-obs. I plan to continue working on that in September.