steps_per_epoch in DDPG.
Hi, I saw that OpenAI Spinning Up has
spinup.ddpg_tf1(..., steps_per_epoch=4000, epochs=100, ...)
which specifies the number of steps in each epoch. Is there a similar setting in stable_baselines? Thanks!
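For reference, a minimal sketch of how the same training budget is usually expressed in stable-baselines, assuming its standard DDPG API: there is no separate epoch parameter, and the total number of environment steps is passed to learn(). Pendulum-v0 and the 400,000-step budget below are only illustrative, matching 4000 steps/epoch x 100 epochs from the Spinning Up call above.

```python
import gym
from stable_baselines import DDPG
from stable_baselines.ddpg.policies import MlpPolicy

env = gym.make("Pendulum-v0")
model = DDPG(MlpPolicy, env, verbose=1)

# Spinning Up's steps_per_epoch=4000, epochs=100 amounts to 400,000 env steps;
# stable-baselines takes that total directly.
model.learn(total_timesteps=400_000)
```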
Top Results From Across the Web
- garage.tf.algos.ddpg module - Read the Docs: Deep Deterministic Policy Gradient (DDPG) implementation in TensorFlow. class DDPG(env_spec, policy, qf, replay_buffer, *, steps_per_epoch=20, ...)
- Deep Deterministic Policy Gradient - Spinning Up in Deep RL: Deep Deterministic Policy Gradient (DDPG) is an algorithm which concurrently learns a Q-function and a policy. It uses off-policy data and the Bellman ...
- deeprl.agents.DDPG Example - Program Talk: Learn how to use the python api deeprl.agents.DDPG. ... def ddpg(output_dir, seed, env_name='Swimmer-v2', hidden_sizes=(400, 300), steps_per_epoch=5000, ...)
- Proximal Policy Optimization - Keras: Hyperparameters of the PPO algorithm: steps_per_epoch = 4000, epochs = 30, gamma = 0.99, clip_ratio = 0.2, policy_learning_rate = 3e-4, ...
- Implementing Spinningup Pytorch DDPG for Cartpole-v0: I am trying to implement DDPG for the cartpole problem from here: ... MLPActorCritic, ac_kwargs=dict(), seed=0, steps_per_epoch=4000, ...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
T usually, and in this case, signifies the end of the episode. Action selection, storing the transition, optimising the networks and updating the targets all happen once per environment step, and when the episode finishes the noise process and the environment are reset (see the sketch after the links below). This is done here:
https://github.com/hill-a/stable-baselines/blob/950c2a5bf95a9fa908be26fd5db11aa60cfa2b2a/stable_baselines/ddpg/ddpg.py#L831-L847
and here:
https://github.com/hill-a/stable-baselines/blob/950c2a5bf95a9fa908be26fd5db11aa60cfa2b2a/stable_baselines/ddpg/ddpg.py#L934-L951
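To make the order of operations concrete, here is a rough, self-contained sketch of the loop those lines implement; policy, ActionNoise, train_step and update_target_networks are hypothetical stand-ins, not stable-baselines functions, and only show where each operation sits relative to the environment steps and episode boundaries.

```python
import gym
import numpy as np

env = gym.make("Pendulum-v0")

# Hypothetical stand-ins for the real actor, noise process and update steps.
def policy(obs):
    return env.action_space.sample()

class ActionNoise:
    def reset(self):
        pass
    def __call__(self):
        return np.zeros(env.action_space.shape)

def train_step():
    pass  # one gradient step on actor and critic, sampled from the replay buffer

def update_target_networks():
    pass  # Polyak-averaged target network update

replay_buffer = []
action_noise = ActionNoise()

obs = env.reset()
action_noise.reset()
for step in range(10_000):
    action = policy(obs) + action_noise()                 # noisy action every env step
    next_obs, reward, done, _ = env.step(action)
    replay_buffer.append((obs, action, reward, next_obs, done))
    train_step()                                          # optimisation once per env step
    update_target_networks()                              # target update once per env step
    obs = next_obs
    if done:                                              # "T": episode finished
        action_noise.reset()                              # reset noise and environment,
        obs = env.reset()                                 # as in the linked code
```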
Alright, thanks a lot @Solliet @Miffyli @araffin. I will try TD3 and SAC instead.