[Question] HER+SAC different results to SB2
Hi, I was previously training on a custom environment with SB2 and wanted to switch to SB3 (mainly because having PyTorch would probably make my deployment easier).
So I trained on SB3 with HER+SAC and the same hyperparameters, but got different results. Is this to be expected due to a different SAC implementation, or what else could be the reason?
SB2 code
env = gym.make('armflex-v4')
eval_env = HERGoalEnvWrapper(env)
eval_callback = EvalCallback(eval_env, best_model_save_path=path,
                             log_path=path, eval_freq=20000,
                             deterministic=True, render=False, n_eval_episodes=15)
model_class = SAC
goal_selection_strategy = 'future'
model = HER('MlpPolicy', env, model_class, n_sampled_goal=4,
            goal_selection_strategy=goal_selection_strategy, verbose=1,
            policy_kwargs=dict(layers=[512, 512]), buffer_size=1000000,
            batch_size=256, gamma=0.99, random_exploration=0.0,
            ent_coef='auto', gradient_steps=1)
model.learn(total_timesteps=TIMESTEPS, callback=eval_callback, log_interval=1)
SB3 code
env = make_vec_env(env_name, n_envs=1)
env = ObsDictWrapper(env)
eval_callback = EvalCallback(eval_env, best_model_save_path=path,
                             log_path=path, eval_freq=20000,
                             deterministic=True, render=False, n_eval_episodes=15)
model_class = SAC
goal_selection_strategy = 'future'
model = HER('MlpPolicy', env, model_class, n_sampled_goal=4, online_sampling=False,
            goal_selection_strategy=goal_selection_strategy, verbose=1,
            policy_kwargs=dict(net_arch=[512, 512]), buffer_size=1000000,
            batch_size=256, gamma=0.99, ent_coef='auto', gradient_steps=1,
            max_episode_length=1000)
model.learn(total_timesteps=TIMESTEPS, callback=eval_callback, log_interval=1)
This should also be the mean over 100 episodes.
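One quick sanity check when comparing the two runs is to diff the keyword arguments that were passed explicitly to the two HER(...) calls in this post. The snippet below is only a sketch: the dictionaries are copied by hand from the listings, and `diff_kwargs` is a hypothetical helper, not part of either library. It cannot see arguments that were left at their defaults, which may differ between SB2 and SB3.

```python
# Keyword arguments copied by hand from the SB2 and SB3 HER(...) calls in the post.
sb2_kwargs = dict(
    n_sampled_goal=4, goal_selection_strategy="future", verbose=1,
    policy_kwargs=dict(layers=[512, 512]), buffer_size=1000000,
    batch_size=256, gamma=0.99, random_exploration=0.0,
    ent_coef="auto", gradient_steps=1,
)
sb3_kwargs = dict(
    n_sampled_goal=4, online_sampling=False, goal_selection_strategy="future",
    verbose=1, policy_kwargs=dict(net_arch=[512, 512]), buffer_size=1000000,
    batch_size=256, gamma=0.99, ent_coef="auto", gradient_steps=1,
    max_episode_length=1000,
)


def diff_kwargs(a, b):
    """Hypothetical helper: return (only_in_a, only_in_b, changed) for two dicts."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    # Keys present in both dicts whose values are not equal.
    changed = {k: (a[k], b[k]) for k in sorted(set(a) & set(b)) if a[k] != b[k]}
    return only_a, only_b, changed


only_sb2, only_sb3, changed = diff_kwargs(sb2_kwargs, sb3_kwargs)
print("only in SB2:", only_sb2)
print("only in SB3:", only_sb3)
print("changed:", changed)
```

For the two calls above, the explicit differences are `random_exploration` (SB2 only), `online_sampling` and `max_episode_length` (SB3 only), and the renamed network-size key inside `policy_kwargs` (`layers` versus `net_arch`).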
Issue Analytics
- State:
- Created 3 years ago
- Comments: 7 (3 by maintainers)
Top GitHub Comments
Thanks, the newer version indeed fixed the issue. I must have installed SB3 just a few days before that version was released. I have already been able to achieve better results, so I guess this question can be closed. Thanks again for your fast support.
Do you have the latest version of Stable-Baselines3? See https://github.com/DLR-RM/stable-baselines3/issues/234
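To answer the version question, it can help to print the installed package version before comparing runs. This is a minimal sketch using only the Python standard library (3.8+); `installed_version` is a hypothetical helper name, and `stable-baselines3` is the distribution name used on PyPI.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed Stable-Baselines3 version, or None if it is missing.
print(installed_version("stable-baselines3"))
```

If the printed version is older than the latest release, upgrading with `pip install --upgrade stable-baselines3` and rerunning the experiment is a cheap first step.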