FetchPickAndPlace not training using DDPG+HER
I am trying to train FetchPickAndPlace as described in https://arxiv.org/pdf/1802.09464.pdf using DDPG+HER. However, regardless of how long I train, the agent fails to learn anything. I saw that #198 mentioned that OpenAI used a number of tricks to get it to work. Has anyone had any luck doing so in stable baselines? Thanks!
FetchReach and FetchPush both train fine.
My current hyperparameters:
FetchPickAndPlace-v1:
  n_timesteps: !!float 5e6
  policy: 'MlpPolicy'
  model_class: 'ddpg'
  n_sampled_goal: 4
  goal_selection_strategy: 'future'
  buffer_size: 1000000
  batch_size: 256
  gamma: 1.0
  critic_l2_reg: 1.0
  observation_range: [-200.0, 200.0]
  random_exploration: 0.3
  actor_lr: !!float 1e-3
  critic_lr: !!float 1e-3
  noise_type: 'normal'
  noise_std: 0.2
  normalize_observations: true
  normalize_returns: false
  policy_kwargs: "dict(layers=[256, 256, 256])"
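For readers unfamiliar with what `n_sampled_goal: 4` and `goal_selection_strategy: 'future'` control, here is a minimal sketch of HER-style relabeling in plain Python. It is not the stable baselines implementation: the helper names `sparse_reward` and `relabel_future`, the dictionary keys, and the 0.05 tolerance are assumptions made for illustration, and the sampling window (current step through end of episode) is a simplification.

```python
import random


def sparse_reward(achieved, goal, tol=0.05):
    """Return 0.0 if `achieved` is within `tol` of `goal`, else -1.0.

    Hypothetical helper: mimics a sparse goal-reaching reward.
    """
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def relabel_future(episode, n_sampled_goal=4, rng=random):
    """Create extra transitions whose goals come from later steps.

    `episode` is a list of dicts with keys "obs", "action" and
    "achieved_next" (assumed layout, for illustration only).
    """
    extra = []
    for t, step in enumerate(episode):
        for _ in range(n_sampled_goal):
            # 'future' strategy (simplified): pick a step at or after t
            # in the same episode and pretend its outcome was the goal.
            future = episode[rng.randrange(t, len(episode))]
            new_goal = future["achieved_next"]
            extra.append({
                "obs": step["obs"],
                "action": step["action"],
                "goal": new_goal,
                "reward": sparse_reward(step["achieved_next"], new_goal),
            })
    return extra
```

With this sketch, every real transition yields four additional relabeled transitions, so the replay buffer receives far more non-failing samples than the sparse task reward alone would provide.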
Issue Analytics
- State:
- Created 4 years ago
- Reactions:2
- Comments:15
Top GitHub Comments
During testing, all the exploration noise is removed and we use a deterministic policy, hence the difference.
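To make the train/test distinction concrete, here is a rough sketch of how action selection might differ between the two modes, using the `random_exploration` and `noise_std` values from the config above. The function `select_action` and the assumption that actions lie in [-1, 1] are hypothetical; this is not the stable baselines code.

```python
import random


def select_action(policy_action, training, random_exploration=0.3,
                  noise_std=0.2, rng=random):
    """Return the action to execute given the actor's deterministic output.

    Hypothetical helper; assumes each action component lies in [-1, 1].
    """
    if not training:
        # Test time: no exploration, use the actor's output unchanged.
        return list(policy_action)
    if rng.random() < random_exploration:
        # With probability `random_exploration`, act uniformly at random.
        return [rng.uniform(-1.0, 1.0) for _ in policy_action]
    # Otherwise add Gaussian noise and clip back into the valid range.
    return [max(-1.0, min(1.0, a + rng.gauss(0.0, noise_std)))
            for a in policy_action]
```

Under these assumptions, roughly 30% of training actions are fully random and the rest are perturbed, so success rates logged during training can be noticeably lower than those measured during evaluation.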
Will do, I’ll get back to you in a few weeks (hardware I would run it on is currently occupied) 😃