Environment is reset twice per episode when evaluating policy on DummyVecEnv

See original GitHub issue

The evaluate_policy helper function resets the environment at the start of each episode:

https://github.com/DLR-RM/stable-baselines3/blob/494ebfd20abe90acc136fdaf215c76ec566acd2c/stable_baselines3/common/evaluation.py#L33-L34

But DummyVecEnv already resets the environment automatically when step returns done = True:

https://github.com/DLR-RM/stable-baselines3/blob/494ebfd20abe90acc136fdaf215c76ec566acd2c/stable_baselines3/common/vec_env/dummy_vec_env.py#L45-L48

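As a result, the environment is reset twice per episode when evaluating the policy: once automatically by DummyVecEnv when the episode ends, and once explicitly by evaluate_policy at the start of the next episode.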
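The double reset can be reproduced without installing the library. The sketch below is only an illustration under stated assumptions: CountingEnv, AutoResetVecEnv and evaluate are hypothetical stand-ins written for this example, not the actual stable-baselines3 classes. They only mimic the two behaviors described above (automatic reset on done, explicit reset at the start of each evaluation episode).

```python
class CountingEnv:
    """Toy environment (hypothetical): fixed-length episodes, counts reset() calls."""

    def __init__(self, episode_length=3):
        self.episode_length = episode_length
        self.reset_calls = 0
        self.t = 0

    def reset(self):
        self.reset_calls += 1
        self.t = 0
        return 0  # dummy observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_length
        return self.t, 1.0, done, {}


class AutoResetVecEnv:
    """Simplified stand-in for a vectorized wrapper that resets an env when it is done."""

    def __init__(self, envs):
        self.envs = envs

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        obs_list, rewards, dones, infos = [], [], [], []
        for env, action in zip(self.envs, actions):
            obs, reward, done, info = env.step(action)
            if done:
                obs = env.reset()  # automatic reset at the end of the episode
            obs_list.append(obs)
            rewards.append(reward)
            dones.append(done)
            infos.append(info)
        return obs_list, rewards, dones, infos


def evaluate(vec_env, n_eval_episodes):
    """Evaluation loop that also resets explicitly at the start of every episode."""
    episode_rewards = []
    for _ in range(n_eval_episodes):
        vec_env.reset()  # explicit reset, on top of the automatic one
        done, total = False, 0.0
        while not done:
            _, rewards, dones, _ = vec_env.step([0])
            total += rewards[0]
            done = dones[0]
        episode_rewards.append(total)
    return episode_rewards


env = CountingEnv(episode_length=3)
episode_rewards = evaluate(AutoResetVecEnv([env]), n_eval_episodes=5)
print(env.reset_calls)  # 10 resets for 5 episodes: two per episode
```

In this toy setup the reported rewards are unaffected, but an environment with an expensive or stateful reset (for example, one that samples a new task on every reset) would do redundant work or skip every other initial state.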

Issue Analytics

State:
Created 3 years ago
Comments: 10 (5 by maintainers)

Top GitHub Comments

1 reaction

araffin commented, Jun 24, 2020

Shall it reset all environments, or only reset the ones that need resetting?

It shall reset all envs; you have env_method() for something more granular. We have to keep in mind that this feature will be used in special cases only and the current behavior works in most cases, so I would avoid overcomplicating things.
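To make the "more granular" option concrete, here is a minimal sketch of the idea. ToyEnv and ToyVecEnv are hypothetical classes written for this example; the env_method signature is only modelled loosely on the VecEnv method mentioned in the comment and should be checked against the library documentation before relying on it.

```python
class ToyEnv:
    """Hypothetical environment that only counts how often it is reset."""

    def __init__(self):
        self.reset_calls = 0

    def reset(self):
        self.reset_calls += 1
        return 0  # dummy observation


class ToyVecEnv:
    """Hypothetical vectorized wrapper with a granular env_method()."""

    def __init__(self, envs):
        self.envs = envs

    def reset(self):
        # Resets every sub-environment.
        return [env.reset() for env in self.envs]

    def env_method(self, method_name, *args, indices=None, **kwargs):
        # Calls the named method on the selected sub-environments only
        # (all of them when indices is None).
        targets = range(len(self.envs)) if indices is None else indices
        return [getattr(self.envs[i], method_name)(*args, **kwargs) for i in targets]


vec_env = ToyVecEnv([ToyEnv() for _ in range(3)])
vec_env.reset()                           # resets all three envs
vec_env.env_method("reset", indices=[1])  # resets only the second env
reset_counts = [env.reset_calls for env in vec_env.envs]
print(reset_counts)  # [1, 2, 1]
```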

1 reaction

araffin commented, Jun 22, 2020

To help clarify how VecEnv works, we could add a reset_automatically=True parameter to the step function.

I don’t like changing the API of step() 😕 (which should mimic the gym API), even though I understand your point.
