[Question] HER+SAC different results to SB2


Hi, I was previously training on a custom environment with SB2 and wanted to switch to SB3 (mainly because having PyTorch would probably make my deployment easier).

So I trained on SB3 with HER+SAC and the same hyperparameters, but got different results. Is this to be expected due to a different SAC implementation, or what else could be the reason?

SB2 code

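import gym
from stable_baselines import HER, SAC
from stable_baselines.common.callbacks import EvalCallback
from stable_baselines.her import HERGoalEnvWrapper

# `path` and `TIMESTEPS` are defined elsewhere in the original script.
env = gym.make('armflex-v4')  # custom goal-based environment, registered beforehand
eval_env = HERGoalEnvWrapper(env)  # converts the dict observation for evaluation
eval_callback = EvalCallback(eval_env, best_model_save_path=path,
                             log_path=path, eval_freq=20000,
                             deterministic=True, render=False, n_eval_episodes=15)
model_class = SAC
goal_selection_strategy = 'future'
model = HER('MlpPolicy', env, model_class, n_sampled_goal=4,
            goal_selection_strategy=goal_selection_strategy, verbose=1,
            policy_kwargs=dict(layers=[512, 512]), buffer_size=1000000, batch_size=256,
            gamma=0.99, random_exploration=0.0, ent_coef='auto', gradient_steps=1)

model.learn(total_timesteps=TIMESTEPS, callback=eval_callback, log_interval=1)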

[Image: sb2]

SB3 code

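# Import paths as of the SB3 releases that shipped the HER class;
# later releases changed this API.
from stable_baselines3 import HER, SAC
from stable_baselines3.common.callbacks import EvalCallback
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.vec_env.obs_dict_wrapper import ObsDictWrapper

# `env_name`, `path` and `TIMESTEPS` are defined elsewhere in the original script.
env = make_vec_env(env_name, n_envs=1)
env = ObsDictWrapper(env)
# NOTE: `eval_env` is not defined in the snippet as posted.
eval_callback = EvalCallback(eval_env, best_model_save_path=path,
                             log_path=path, eval_freq=20000,
                             deterministic=True, render=False, n_eval_episodes=15)
model_class = SAC
goal_selection_strategy = 'future'
model = HER('MlpPolicy', env, model_class, n_sampled_goal=4, online_sampling=False,
            goal_selection_strategy=goal_selection_strategy, verbose=1,
            policy_kwargs=dict(net_arch=[512, 512]), buffer_size=1000000, batch_size=256,
            gamma=0.99, ent_coef='auto', gradient_steps=1, max_episode_length=1000)

model.learn(total_timesteps=TIMESTEPS, callback=eval_callback, log_interval=1)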

[Image: sb3]

This should also be the mean over 100 episodes.
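For readers who want to compare the two curves on equal footing, here is a minimal sketch of a mean over the last 100 episodes. It assumes the plotted value is a rolling mean of per-episode returns; the issue does not say how the curves were computed, and `rolling_mean` is a hypothetical helper, not part of either library.

```python
from collections import deque
from typing import Iterable, List


def rolling_mean(episode_returns: Iterable[float], window: int = 100) -> List[float]:
    """Return the mean of the last `window` episode returns after each episode."""
    recent = deque(maxlen=window)  # holds only the newest `window` returns
    running_sum = 0.0
    means = []
    for ret in episode_returns:
        if len(recent) == window:
            # The oldest value is about to be evicted by append().
            running_sum -= recent[0]
        recent.append(ret)
        running_sum += ret
        means.append(running_sum / len(recent))
    return means


if __name__ == "__main__":
    # Made-up returns, only to show the call.
    print(rolling_mean([0.0, 1.0, 2.0, 3.0], window=2))  # [0.0, 0.5, 1.5, 2.5]
```

Applying the same window to the raw episode returns of both runs avoids comparing a smoothed curve against an unsmoothed one.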

Issue Analytics

  • State: closed
  • Created 3 years ago
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

1 reaction
Ludilu commented, Mar 5, 2021

Thanks, the newer version indeed fixed the issue. I must have installed it just a few days before the new release came out. I was already able to achieve higher results, so I guess this question can be closed. Thanks again for your fast support.
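Since the fix came from upgrading, it can help to confirm which release is actually installed before comparing results. The sketch below uses only the Python standard library (3.8+); `installed_version` is a hypothetical helper name, and the thread does not state which release contains the fix.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "stable-baselines3" is the distribution name on PyPI.
    print(installed_version("stable-baselines3"))
```

If this prints `None`, the package is not installed in the active environment, which is worth ruling out when several virtual environments are in use.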

0 reactions
araffin commented, Mar 1, 2021

"Unfortunately I have a problem with online_sampling=True. I always get the following error at exactly episode = (buffer_size / max_episode_length)"

Do you have the latest version of Stable-Baselines3? See https://github.com/DLR-RM/stable-baselines3/issues/234
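The episode index mentioned in the quoted report can be worked out from the hyperparameters in the SB3 snippet. The sketch assumes the buffer reserves `max_episode_length` transitions per stored episode, so the quotient is the number of episodes it can hold before it is full; that reading is an assumption and is not confirmed in the thread. `episodes_until_full` is a hypothetical helper.

```python
def episodes_until_full(buffer_size: int, max_episode_length: int) -> int:
    """Number of fixed-length episode slots that fit into the buffer."""
    if max_episode_length <= 0:
        raise ValueError("max_episode_length must be positive")
    return buffer_size // max_episode_length


if __name__ == "__main__":
    # Values taken from the SB3 snippet in the question.
    print(episodes_until_full(buffer_size=1_000_000, max_episode_length=1000))  # 1000
```

Under that assumption, the error would show up at episode 1000, the point where the buffer first wraps around.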
