Some questions regarding VecNormalize


According to the docs, when creating custom environments, we should always normalize the observation space. For this, you have the VecNormalize wrapper, which keeps a moving average of the observations and uses it to normalize each obs.

Let's say I have 2 observations: height (m) and weight (kg) of a person. My observation space would be something like a Box with low = [0, 0] and high = [2.5, 300]. But since I’m using a VecNormalize, this isn’t correct anymore, right?

So should I instead change it to low = [-10, -10] and high = [10, 10]? (10 being the default clipping value for VecNormalize)

Another question: when should we normalize the rewards as well? (in the mujoco example shown in the docs you chose to only normalize the observations - why?)

Finally, what’s the purpose of the discount factor? Should it be the same as the discount factor of whatever algorithm we’re using?
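
For reference, the observation normalization asked about above can be sketched in plain Python. This is a simplified illustration and not the library's code: the `RunningMeanStd` class and `normalize_obs` function here are stand-ins written for this example, the height/weight samples are made up, and the clip value of 10 is the default mentioned in the question.

```python
import math


class RunningMeanStd:
    """Tracks a running mean and variance per observation dimension."""

    def __init__(self, size, epsilon=1e-4):
        self.mean = [0.0] * size
        self.var = [1.0] * size
        self.count = epsilon  # small prior count avoids division by zero

    def update(self, x):
        """Fold a single observation (a list of floats) into the statistics."""
        new_count = self.count + 1
        for i, value in enumerate(x):
            delta = value - self.mean[i]
            # Parallel-variance merge of the old statistics with one new sample
            m2 = self.var[i] * self.count + delta * delta * self.count / new_count
            self.mean[i] += delta / new_count
            self.var[i] = m2 / new_count
        self.count = new_count


def normalize_obs(obs, rms, clip_obs=10.0, epsilon=1e-8):
    """Standardize each dimension, then clip to [-clip_obs, clip_obs]."""
    out = []
    for value, mean, var in zip(obs, rms.mean, rms.var):
        z = (value - mean) / math.sqrt(var + epsilon)
        out.append(max(-clip_obs, min(clip_obs, z)))
    return out


# Hypothetical (height in m, weight in kg) samples, for illustration only.
rms = RunningMeanStd(size=2)
for sample in [[1.60, 55.0], [1.75, 70.0], [1.90, 95.0]]:
    rms.update(sample)

normalized = normalize_obs([1.75, 70.0], rms)
outlier = normalize_obs([1000.0, 70.0], rms)  # first entry hits the clip bound
```

Whatever bounds the raw Box declares, the values that come out of this step always lie within the clip range, which is why the question of the declared bounds comes up at all.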

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 2
  • Comments: 8

Top GitHub Comments

2 reactions
araffin commented, Feb 20, 2020

So should I instead change it to

The boundaries of the observation space do not really matter (for anything that is not an image); we usually set them to [-inf, inf].

Another question: when should we normalize the rewards as well? Finally, what’s the purpose of the discount factor? Should it be the same as the discount factor of whatever algorithm we’re using?

Good question. The answer is in https://github.com/openai/baselines/issues/538 and https://github.com/openai/baselines/issues/629. An additional resource: https://github.com/hill-a/stable-baselines/issues/234

Should it be the same as the discount factor of whatever algorithm we’re using?

yes
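
To make the role of the discount factor concrete, here is a minimal sketch of reward normalization in the style VecNormalize uses: a running discounted return is tracked with `gamma`, and each reward is divided by the standard deviation of that return. The `RewardNormalizer` class is a stand-in written for this example, not the library's API, and it handles a single environment rather than a vector of them.

```python
import math


class RewardNormalizer:
    """Scales rewards by the running std of the discounted return."""

    def __init__(self, gamma=0.99, clip_reward=10.0, epsilon=1e-8):
        self.gamma = gamma
        self.clip_reward = clip_reward
        self.epsilon = epsilon
        self.ret = 0.0     # discounted return accumulated in this episode
        self.count = 1e-4  # small prior count avoids division by zero
        self.mean = 0.0
        self.var = 1.0

    def _update_stats(self, x):
        """Fold one discounted-return value into the running mean/variance."""
        new_count = self.count + 1
        delta = x - self.mean
        m2 = self.var * self.count + delta * delta * self.count / new_count
        self.mean += delta / new_count
        self.var = m2 / new_count
        self.count = new_count

    def step(self, reward, done):
        # gamma enters here: the statistic tracks the discounted return,
        # which is the quantity the RL algorithm itself estimates.
        self.ret = self.ret * self.gamma + reward
        self._update_stats(self.ret)
        scaled = reward / math.sqrt(self.var + self.epsilon)
        scaled = max(-self.clip_reward, min(self.clip_reward, scaled))
        if done:
            self.ret = 0.0  # start a fresh return for the next episode
        return scaled


normalizer = RewardNormalizer(gamma=0.99)
scaled_rewards = [normalizer.step(1.0, done=False) for _ in range(50)]
return_before_reset = normalizer.ret
normalizer.step(1.0, done=True)
```

Because the scale is derived from the discounted return, using the same `gamma` as the learning algorithm keeps the normalized rewards consistent with the returns the algorithm is trying to fit. Note that only the scale is adjusted; the mean is not subtracted.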

in the mujoco example shown in the docs you chose to only normalize the observations - why?

We should change that (we would appreciate a PR). It is an old example, and there is no real reason not to normalize the reward too.

0 reactions
araffin commented, Feb 26, 2020

Or is that a different kind of normalization?

Layer normalization is quite different, see the associated paper: https://arxiv.org/abs/1607.06450. It is there mostly because of the parameter noise exploration for DDPG (cf. the docs).

I wanted to follow up on this topic to ensure I am implementing VecNormalize properly.

@cevans3098 I can only recommend taking a look at the rl zoo. In your case, you forgot to save and load the VecNormalize wrapper.
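
Why saving and loading matters can be shown with a small standalone sketch. It does not use the library: the statistics are a plain dict persisted with `pickle`, the `normalize` helper and the numeric values are hypothetical, and the wrapper's own persistence methods (whose names depend on the installed version) should be used in real code.

```python
import math
import os
import pickle
import tempfile


def normalize(value, stats, clip=10.0, epsilon=1e-8):
    """Standardize a scalar with the given running statistics, then clip."""
    z = (value - stats["mean"]) / math.sqrt(stats["var"] + epsilon)
    return max(-clip, min(clip, z))


# Statistics as they might look at the end of training (hypothetical values).
trained_stats = {"mean": 70.0, "var": 225.0}

# Save the statistics next to the model.
path = os.path.join(tempfile.mkdtemp(), "normalizer_stats.pkl")
with open(path, "wb") as f:
    pickle.dump(trained_stats, f)

# Later, at evaluation time, load them back.
with open(path, "rb") as f:
    restored_stats = pickle.load(f)

# What a freshly created normalizer would start with if nothing is loaded.
fresh_stats = {"mean": 0.0, "var": 1.0}

raw_weight = 85.0
seen_in_training = normalize(raw_weight, trained_stats)
with_restored = normalize(raw_weight, restored_stats)  # same as in training
with_fresh = normalize(raw_weight, fresh_stats)        # saturates at the clip
```

With the restored statistics the policy sees the same inputs it was trained on; with fresh statistics the same raw value maps to a very different input, which usually shows up as a sudden drop in performance after reloading a model.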

Closing this issue as the original question was answered.


