
[question] Loading the PPO model after training does not seem to load the policy

See original GitHub issue

I’ve read similar questions asked here about loading a model after training (e.g. #30), but I still could not figure out what the problem with my model is. The model does not seem to be using the trained policy/value networks when I run the following. I am not sure what is wrong with my setup, and this does not look like a bug, but I was wondering if anyone can tell me whether what I am doing is incorrect. (check_env(env) does not give me any warnings and the custom environment runs fine; it just makes random decisions, although the in-progress training results suggest that the agent learned the task during training.)

import os

from stable_baselines3 import PPO
from stable_baselines3.common.env_checker import check_env
from stable_baselines3.common.evaluation import evaluate_policy
from stable_baselines3.common.monitor import Monitor

log_dir = "./PPO/"
os.makedirs(log_dir, exist_ok=True)

env = Monitor(CustomEnv(8090), log_dir)
# I do have VecNormalize for the training, and the training is done on a
# vectorized environment, but the evaluation is done on a single env.
# env = VecNormalize(env, norm_obs=True, norm_reward=True, clip_obs=1.)

check_env(env)
model = PPO.load(log_dir + "/rl_model_8080_12000000_steps", env=env, verbose=True, tensorboard_log=log_dir)
model.set_env(env)  # although the load above should already attach the env

mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10)
print(f"mean_reward:{mean_reward:.2f} +/- {std_reward:.2f}")

System Info

Describe the characteristic of your environment:

  • conda virtual env
  • I have an RTX 2080 but don’t think it is being used
  • python 3.6.13, stable-baselines3 1.1.0, tensorflow 1.14.0, pytorch 1.4.0

Issue Analytics

  • State: closed
  • Created: 2 years ago
  • Comments: 5

Top GitHub Comments

1 reaction
araffin commented, Sep 21, 2021

0 reactions
Milad-Rakhsha commented, Sep 23, 2021

Yes, thanks for helping.

Read more comments on GitHub >

Top Results From Across the Web

Training the same model after loading. · Issue #30 - GitHub
Hello again. I was looking into continuing training after loading a model, simply using model.load("path-to-model") and then model.learn(total_timesteps=500000)

stable-baselines3 PPO model loaded but not working
I create the PPO model and make it learn for a couple thousand timesteps. Now when I evaluate the policy, the car renders...

Reinforcement Learning in Python with Stable Baselines 3
Welcome to part 2 of the reinforcement learning with Stable Baselines 3 tutorials. We left off with training a few models in the...

tensorforce/community - Gitter
I save and load the model with saved-model, but I faced another problem when loading the saved-model format: I can't load the saved model.

Proximal Policy Optimization — Spinning Up documentation
PPO is an on-policy algorithm. PPO can be used for environments with either discrete or continuous action spaces. The Spinning Up implementation of...
