
[Question] RL Zoo uses dummy vec env by default?


Question

I’m trying to understand how parallelism is implemented in stable-baselines.

By default, as in train.py in the RL Zoo, it seems like DummyVecEnv is used.

However, by my understanding of the documentation, DummyVecEnv calls each env in sequence, and hence we shouldn’t expect any speedup from it.
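For context, here is a minimal sketch of the two wrappers (this assumes Stable-Baselines3 and Gym; it is illustrative, not the actual train.py code from the zoo):

```python
# Sketch: DummyVecEnv steps its sub-environments sequentially in one
# process; SubprocVecEnv runs each sub-environment in a worker process.
import gym
from stable_baselines3.common.vec_env import DummyVecEnv, SubprocVecEnv

def make_env():
    return gym.make("CartPole-v1")

if __name__ == "__main__":  # guard required by SubprocVecEnv on some platforms
    dummy_env = DummyVecEnv([make_env for _ in range(4)])      # sequential stepping
    subproc_env = SubprocVecEnv([make_env for _ in range(4)])  # parallel processes
```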

Therefore, I have three quick questions:

  1. Is there a reason why SubprocVecEnv is not used here? I mean, isn’t it faster?
  2. If DummyVecEnv was used, how long did training take (approximately) for 1 million timesteps for, let’s say, halfcheetah-v2? Was the training duration reasonable?
  3. Is VecNormalize used by default in both cases?

I have read through the documentation but couldn’t find answers on these. Thanks for the help in advance.
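On question 3, for reference: VecNormalize is an explicit wrapper applied on top of either vectorization backend, not something the backends enable themselves. A minimal sketch (assuming Stable-Baselines3; the env id may vary with your Gym version):

```python
import gym
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

venv = DummyVecEnv([lambda: gym.make("Pendulum-v1")])
# Normalizes observations and rewards using running statistics.
venv = VecNormalize(venv, norm_obs=True, norm_reward=True)
```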

Checklist

  • I have read the documentation (required)
  • I have checked that there is no similar issue in the repo (required)

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Comments: 7 (2 by maintainers)

Top GitHub Comments

1 reaction
araffin commented, Nov 3, 2021

What are some characteristics of an environment that affect the VecEnv choice?

It depends on where the bottleneck is. It is common for reset() to be the most costly operation (it may take several seconds in robotic environments); in that case it is recommended to use subprocesses so the resets run in parallel.

The problem with subprocesses is that they come with a communication overhead (as shown in the colab linked in the documentation and by @Miffyli), so for envs that do not require heavy computation, like pendulum/cartpole, it does not make sense to use subprocesses. However, you may observe a good speedup with Atari games, for instance (given that you have a good CPU).
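A rough way to see where that trade-off lands for your own environment is to time raw stepping throughput under both wrappers. A sketch, assuming Stable-Baselines3 (CartPole-v1 is a placeholder; absolute numbers depend on your CPU and env cost):

```python
import time
import gym
from stable_baselines3.common.vec_env import DummyVecEnv, SubprocVecEnv

N_ENVS, N_STEPS = 8, 1_000

def make_env():
    return gym.make("CartPole-v1")

def throughput(vec_env_cls):
    env = vec_env_cls([make_env for _ in range(N_ENVS)])
    env.reset()
    # One (fixed) action per sub-environment; episodes auto-reset in a VecEnv.
    actions = [env.action_space.sample() for _ in range(N_ENVS)]
    start = time.perf_counter()
    for _ in range(N_STEPS):
        env.step(actions)
    elapsed = time.perf_counter() - start
    env.close()
    return N_ENVS * N_STEPS / elapsed  # env-steps per second

if __name__ == "__main__":
    print("DummyVecEnv  :", throughput(DummyVecEnv))
    print("SubprocVecEnv:", throughput(SubprocVecEnv))
```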

However, at some point it also does not make sense to add more envs, for two reasons (also discussed in the colab):

  • adding more envs favors exploration and makes the whole training less sample-efficient
  • the bottleneck will come from the gradient update (done sequentially, not asynchronously)

This highly depends on your environment (and on your hyperparameters).

Regarding the hyperparameters, I’m thinking of n_epochs and train_freq / gradient_steps, which control the trade-off between data collection and gradient updates. Again, if the time spent on gradient updates is much greater than the time spent on data collection, it does not really make sense to add more envs.
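As a concrete illustration of those knobs (a sketch with placeholder values, not zoo-tuned hyperparameters):

```python
from stable_baselines3 import PPO, SAC

# On-policy (PPO): n_epochs sets how many gradient passes are made over
# each freshly collected rollout batch.
ppo = PPO("MlpPolicy", "CartPole-v1", n_epochs=10)

# Off-policy (SAC): train_freq / gradient_steps set how often, and how
# many, gradient updates happen per batch of collected transitions.
sac = SAC("MlpPolicy", "Pendulum-v1", train_freq=1, gradient_steps=1)
```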

1 reaction
Miffyli commented, Nov 2, 2021

Now I understand that overhead might be a problem for SubprocVecEnv and might make it worse than DummyVecEnv. But, for example, is a DummyVecEnv of 10 envs faster than running 1 env for 10 times longer? If so, why? I guess this confuses me the most.

One main contributor may be fewer calls to the agent’s predict function, which is used to get actions during rollouts (we query actions for all environments with one predict call, regardless of the number of environments). With more envs you better utilize the parallel nature of networks (bigger batches): predicting actions for a single environment or for eight takes roughly the same time. You also end up with bigger batches of data before each training step, so there is less overhead in preparing rollout buffers and whatnot.
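In other words, a single predict call serves every sub-environment at once. A minimal sketch (assuming Stable-Baselines3):

```python
import gym
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv

n_envs = 8
env = DummyVecEnv([lambda: gym.make("CartPole-v1") for _ in range(n_envs)])
model = PPO("MlpPolicy", env)

obs = env.reset()                 # batched: shape (n_envs, obs_dim)
actions, _ = model.predict(obs)   # one forward pass yields all 8 actions
obs, rewards, dones, infos = env.step(actions)
```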

I checked the documentation for such arguments, but didn’t find any.

Ah sorry, I should have been clearer: I thought we had more documentation on this, but it turns out we do not 😃


