How to use normalizations in inference?

Important Note: We do not do technical support, nor consulting and don’t answer personal questions per email. Please post your question on the RL Discord, Reddit or Stack Overflow in that case.

📚 Documentation

A clear and concise description of what should be improved in the documentation.

Checklist

I have read the documentation (required)
I have checked that there is no similar issue in the repo (required)

We have used your VecNormalize wrapper and like it a lot, but we are wondering how to use the final normalization in setups that only do inference.

We export the normalization statistics as described in the documentation:

import os

stats_path = os.path.join("archive", log_name_, "vec_normalize.pkl")
env.save(stats_path)

And load them accordingly:

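from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

env = DummyVecEnv([OurVeryFancyEnv])
env = VecNormalize.load(stats_path, env)
env.training = False  # do not update the running statistics at test time
env.norm_reward = False  # reward normalization is not needed for inference
model_test = PPO.load(os.path.join("archive", log_name_, "best_model.zip"), env=env)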

Is it truly necessary to wrap our env in a vectorized environment to load and apply the normalizations for inference? Is the format of the saved statistics documented somewhere, and can we apply them “manually” to the observations?
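To illustrate what applying the statistics “manually” could look like, here is a minimal NumPy sketch. It assumes plain array (Box) observations and that the loaded VecNormalize object exposes `obs_rms.mean`, `obs_rms.var`, `clip_obs` and `epsilon`, with the observation being standardized and then clipped. This matches my reading of the Stable Baselines3 source, but please check it against the version you have installed. The function name `normalize_obs_manually` is hypothetical.

```python
import numpy as np


def normalize_obs_manually(obs, mean, var, clip_obs=10.0, epsilon=1e-8):
    """Standardize an observation with saved running statistics, then clip it.

    Assumption: this mirrors what VecNormalize does for non-dict observations.
    """
    obs = np.asarray(obs, dtype=np.float64)
    return np.clip((obs - mean) / np.sqrt(var + epsilon), -clip_obs, clip_obs)


# With a VecNormalize object `vec_norm` loaded from the pickle file, the call
# would presumably look like this (attribute names are assumptions):
#   normalize_obs_manually(obs, vec_norm.obs_rms.mean, vec_norm.obs_rms.var,
#                          vec_norm.clip_obs, vec_norm.epsilon)

# Stand-alone demonstration with made-up statistics:
example = normalize_obs_manually(
    [1.0, 2.0], mean=np.array([1.0, 0.0]), var=np.array([1.0, 4.0])
)
```

The normalized observation can then be passed to `model.predict` without wrapping the environment, provided the statistics come from the same training run as the model.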

Issue Analytics

Created a year ago
Comments: 6 (3 by maintainers)

Top GitHub Comments

jank324 commented, Jul 6, 2022

I’ve had the same issue as @Trolldemorted and implemented a somewhat hacky utility wrapper to solve it.

import pickle

import gym


class NotVecNormalize(gym.Wrapper):
    """
    Normal Gym wrapper that replicates the functionality of Stable Baselines3's VecNormalize wrapper
    for non VecEnvs (i.e. `gym.Env`) in production.
    """
    
    def __init__(self, env, path):
        super().__init__(env)

        with open(path, "rb") as file_handler:
            self.vec_normalize = pickle.load(file_handler)

    def reset(self):
        observation = self.env.reset()
        return self.vec_normalize.normalize_obs(observation)
    
    def step(self, action):
        observation, reward, done, info = self.env.step(action)
        observation = self.vec_normalize.normalize_obs(observation)
        reward = self.vec_normalize.normalize_reward(reward)
        return observation, reward, done, info

I implemented this wrapper because VecEnv automatically resets environments at the end of an episode. While this behaviour is very nice in training, in production it became a bit of a nuisance. (I am using RL for optimisation and want to retain the state achieved by the end of an episode.) Applying the normalisation to a normal gym.Env solves this problem nicely.

I’m not sure if my situation is niche or if there is a better way to disable the automatic reset in VecEnvs, but maybe it would make sense to include a clean version of this utility wrapper in Stable Baselines.

Note: I’m not sure how all of this will be affected by the changes to Gym beyond v0.21 (especially VectorEnv).

araffin commented, Jul 6, 2022

Now, if I use a VecEnv, the automatic call to reset resets all the actuators, but what I want is for them to remain on the setting found by the agent.

Sounds like you need to deactivate termination completely at test time (except the first one), or at least deactivate the reset of the actuator after the first episode (should be easy to do), or deactivate the timeout if that is the termination condition you don’t need at test time.
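One way to read the second suggestion is a thin wrapper that forwards only the first `reset()` call and afterwards returns the last observation, so the automatic reset of a VecEnv no longer touches the actuators. The sketch below is plain Python using the old Gym step API (four return values) seen in this thread. `ResetOnce` and `ToyActuatorEnv` are hypothetical names, and in a real setup the wrapper would probably subclass `gym.Wrapper` so it can be passed to `DummyVecEnv`.

```python
class ToyActuatorEnv:
    """Hypothetical stand-in environment: one actuator, episodes of three steps."""

    def __init__(self):
        self.setting = 0.0
        self.steps = 0

    def reset(self):
        self.setting = 0.0  # a real reset would move the actuators back
        self.steps = 0
        return self.setting

    def step(self, action):
        self.setting += action
        self.steps += 1
        done = self.steps >= 3
        return self.setting, 0.0, done, {}


class ResetOnce:
    """Forward only the first reset() to the wrapped env; later calls are no-ops."""

    def __init__(self, env):
        self.env = env
        self._has_reset = False
        self._last_obs = None

    def reset(self):
        if not self._has_reset:
            self._last_obs = self.env.reset()
            self._has_reset = True
        # After the first episode, return the last observation unchanged.
        return self._last_obs

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        self._last_obs = obs
        return obs, reward, done, info


env = ResetOnce(ToyActuatorEnv())
first_obs = env.reset()
for _ in range(3):
    obs, reward, done, info = env.step(1.0)
obs_after_second_reset = env.reset()  # the actuator keeps its final setting
```

Because the wrapped environment is never reset again, any episode counters inside it also keep running, so the termination condition may need adjusting as well, as suggested above.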