Environment is reset twice per episode when evaluating policy on DummyVecEnv

The evaluate_policy helper function resets the environment at the start of each episode:

https://github.com/DLR-RM/stable-baselines3/blob/494ebfd20abe90acc136fdaf215c76ec566acd2c/stable_baselines3/common/evaluation.py#L33-L34

But DummyVecEnv also resets the environment automatically whenever step returns done = True:

https://github.com/DLR-RM/stable-baselines3/blob/494ebfd20abe90acc136fdaf215c76ec566acd2c/stable_baselines3/common/vec_env/dummy_vec_env.py#L45-L48

This causes the environment to reset twice per episode when evaluating the policy.
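
The double reset can be reproduced without installing anything. The following is a minimal sketch, not the Stable-Baselines3 code: CountingEnv, AutoResetVecEnv and evaluate_with_explicit_reset are hypothetical stand-ins that only mimic the two behaviors described above (an explicit reset at the start of each episode, plus an automatic reset when done is True).

```python
class CountingEnv:
    """Toy environment (hypothetical): fixed-length episodes, counts reset() calls."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.reset_calls = 0
        self.t = 0

    def reset(self):
        self.reset_calls += 1
        self.t = 0
        return 0  # dummy observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.horizon
        return self.t, 1.0, done, {}


class AutoResetVecEnv:
    """Hypothetical one-env vectorized wrapper that resets as soon as done is True."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        return [self.env.reset()]

    def step(self, actions):
        obs, reward, done, info = self.env.step(actions[0])
        if done:
            # Automatic reset: the observation returned here already
            # belongs to the next episode.
            obs = self.env.reset()
        return [obs], [reward], [done], [info]


def evaluate_with_explicit_reset(vec_env, n_episodes):
    """Evaluation loop that also calls reset() at the start of every episode."""
    returns = []
    for _ in range(n_episodes):
        obs = vec_env.reset()  # explicit reset
        done, total = False, 0.0
        while not done:
            obs, rewards, dones, _ = vec_env.step([0])
            total += rewards[0]
            done = dones[0]
        returns.append(total)
    return returns


env = CountingEnv(horizon=3)
episode_returns = evaluate_with_explicit_reset(AutoResetVecEnv(env), n_episodes=5)
print(episode_returns)  # five episodes, each with return 3.0
print(env.reset_calls)  # 10: two resets per episode
```

In this toy setup the episode returns are unaffected. The extra reset matters when resetting is expensive or has side effects, for example when the environment draws its initial state from a random number generator.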

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

araffin commented, Jun 24, 2020 (1 reaction)

"Shall it reset all environments, or only reset the ones that need resetting?"

It shall reset all envs; you have env_method() for something more granular. We have to keep in mind that this feature will be used in special cases only and that the current behavior works in most cases, so I would avoid overcomplicating things.

araffin commented, Jun 22, 2020 (1 reaction)

"To help clarify how VecEnv works, we could add the reset_automatically=True parameter to the step function."

I don’t like changing the API of step() 😕 (which should mimic the gym API), even though I understand your point.
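
One way to avoid the extra reset without touching the step() signature is to reset once before the loop and then rely on the automatic reset. The sketch below is an illustration under that assumption and is not presented as the change adopted in the library. CountingEnv, AutoResetVecEnv and evaluate_reset_once are hypothetical names.

```python
class CountingEnv:
    """Toy environment (hypothetical): fixed-length episodes, counts reset() calls."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.reset_calls = 0
        self.t = 0

    def reset(self):
        self.reset_calls += 1
        self.t = 0
        return 0  # dummy observation

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= self.horizon, {}


class AutoResetVecEnv:
    """Hypothetical one-env vectorized wrapper; step() keeps the gym-like signature."""

    def __init__(self, env):
        self.env = env

    def reset(self):
        return [self.env.reset()]

    def step(self, actions):
        obs, reward, done, info = self.env.step(actions[0])
        if done:
            obs = self.env.reset()  # automatic reset
        return [obs], [reward], [done], [info]


def evaluate_reset_once(vec_env, n_episodes):
    """Reset once up front, then let the wrapper start each new episode."""
    obs = vec_env.reset()
    returns, total = [], 0.0
    while len(returns) < n_episodes:
        obs, rewards, dones, _ = vec_env.step([0])
        total += rewards[0]
        if dones[0]:
            # obs already holds the first observation of the next episode.
            returns.append(total)
            total = 0.0
    return returns


env = CountingEnv(horizon=3)
episode_returns = evaluate_reset_once(AutoResetVecEnv(env), n_episodes=5)
print(episode_returns)  # five episodes, each with return 3.0
print(env.reset_calls)  # 6: one initial reset plus one automatic reset per episode
```

The trade-off is that the environment is left freshly reset after the last episode, so n episodes cost n + 1 resets instead of 2n.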
