
[Question] Huge performance difference with different n_envs

See original GitHub issue

Question

I’m running a2c with default parameters on BreakoutNoFrameskip-v4 in two training scenarios, where the only difference is that one uses n_envs=16 (orange) while the other uses n_envs=40 (blue). However, the performance difference is huge. Is there a particular reason behind this behavior? I thought n_envs was more of a parallelization parameter, which shouldn’t have such a large impact on performance.

[Figure: episode reward curves for the two runs, n_envs=16 (orange) vs. n_envs=40 (blue)]
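For reference, here is a minimal sketch of the setup being compared, written against the plain stable-baselines3 API rather than the zoo's training script (the TF-like RMSprop optimizer from the config below is omitted for brevity); n_envs is the only value that differs between the two runs:

from stable_baselines3 import A2C
from stable_baselines3.common.env_util import make_atari_env
from stable_baselines3.common.vec_env import VecFrameStack

n_envs = 16  # the only value changed between the two runs (16 vs. 40)

# make_atari_env applies AtariWrapper to each copy and returns a vectorized env
env = make_atari_env("BreakoutNoFrameskip-v4", n_envs=n_envs, seed=0)
env = VecFrameStack(env, n_stack=4)  # frame_stack: 4 from config.yml

model = A2C("CnnPolicy", env, ent_coef=0.01, vf_coef=0.25, verbose=1)
model.learn(total_timesteps=10_000_000)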

Additional context

config.yml:

!!python/object/apply:collections.OrderedDict
- - - ent_coef
    - 0.01
  - - env_wrapper
    - - stable_baselines3.common.atari_wrappers.AtariWrapper
  - - frame_stack
    - 4
  - - n_envs
    - 40 (or 16)
  - - n_timesteps
    - 10000000.0
  - - policy
    - CnnPolicy
  - - policy_kwargs
    - dict(optimizer_class=RMSpropTFLike, optimizer_kwargs=dict(eps=1e-5))
  - - vf_coef
    - 0.25

args.yml:

!!python/object/apply:collections.OrderedDict
- - - algo
    - a2c
  - - env
    - BreakoutNoFrameskip-v4
  - - env_kwargs
    - null
  - - eval_episodes
    - 5
  - - eval_freq
    - 10000
  - - gym_packages
    - []
  - - hyperparams
    - null
  - - log_folder
    - logs
  - - log_interval
    - -1
  - - n_eval_envs
    - 1
  - - n_evaluations
    - 20
  - - n_jobs
    - 1
  - - n_startup_trials
    - 10
  - - n_timesteps
    - -1
  - - n_trials
    - 10
  - - no_optim_plots
    - false
  - - num_threads
    - -1
  - - optimization_log_path
    - null
  - - optimize_hyperparameters
    - false
  - - pruner
    - median
  - - sampler
    - tpe
  - - save_freq
    - -1
  - - save_replay_buffer
    - false
  - - seed
    - 0
  - - storage
    - null
  - - study_name
    - null
  - - tensorboard_log
    - ''
  - - trained_agent
    - ''
  - - truncate_last_trajectory
    - true
  - - uuid
    - false
  - - vec_env
    - dummy
  - - verbose
    - 1

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 8 (1 by maintainers)

Top GitHub Comments

1 reaction
qgallouedec commented, Feb 10, 2022

As far as I know, you are indeed reporting a real bug. Varying the number of environments should have a limited impact on the result. For what it’s worth, I reproduced the same curve as you, which seems to confirm that we are not within the error bands.

[Screenshot: reproduced training curves, 2022-02-10]
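For context on why n_envs is not purely a parallelization knob in A2C: assuming SB3's default n_steps=5 for A2C (the config above does not override it), each update consumes a rollout of n_steps * n_envs transitions, so for a fixed timestep budget the two runs differ in both the size and the number of gradient updates. A rough sketch of the arithmetic:

# Rough sketch: how n_envs changes A2C's update structure for a fixed
# budget of environment steps (assumes SB3's default n_steps=5 for A2C).
n_steps = 5                  # transitions collected per env between updates
total_timesteps = 10_000_000

for n_envs in (16, 40):
    batch_size = n_steps * n_envs              # transitions per gradient update
    n_updates = total_timesteps // batch_size  # gradient updates over training
    print(f"n_envs={n_envs}: batch={batch_size} transitions, ~{n_updates:,} updates")

# n_envs=16: batch=80 transitions, ~125,000 updates
# n_envs=40: batch=200 transitions, ~50,000 updates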
1 reaction
qgallouedec commented, Feb 9, 2022
