FetchPickAndPlace not training using DDPG+HER

I am trying to train FetchPickAndPlace as per https://arxiv.org/pdf/1802.09464.pdf using DDPG+HER. However, regardless of how long I train, the agent fails to learn anything. I saw that #198 mentioned that OpenAI used a number of tricks to get it to work. Has anyone had any luck doing so in stable baselines? Thanks!

FetchReach and FetchPush both train fine.

My current hyperparameters:

FetchPickAndPlace-v1:
  n_timesteps: !!float 5e6
  policy: 'MlpPolicy'
  model_class: 'ddpg'
  n_sampled_goal: 4
  goal_selection_strategy: 'future'
  buffer_size: 1000000
  batch_size: 256
  gamma: 1.0
  critic_l2_reg: 1.0
  observation_range: [-200.0, 200.0]
  random_exploration: 0.3
  actor_lr: !!float 1e-3
  critic_lr: !!float 1e-3
  noise_type: 'normal'
  noise_std: 0.2
  normalize_observations: true
  normalize_returns: false
  policy_kwargs: "dict(layers=[256, 256, 256])"
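
For readers unfamiliar with what `n_sampled_goal: 4` and `goal_selection_strategy: 'future'` ask HER to do, here is a minimal sketch in plain Python. It is not the stable baselines implementation: the function names, the transition layout, and the 0.05 distance threshold are assumptions made for illustration. Each transition is stored once with its original goal and then re-stored four times with a goal that was actually achieved later in the same episode, with the sparse reward recomputed against that substituted goal.

```python
import math
import random


def sparse_reward(achieved_goal, desired_goal, threshold=0.05):
    # Sparse reward: 0 when the goal is reached, -1 otherwise.
    # The 0.05 threshold is an assumption for this sketch.
    return 0.0 if math.dist(achieved_goal, desired_goal) <= threshold else -1.0


def relabel_future(episode, n_sampled_goal=4, rng=random):
    """Return the original transitions plus HER 'future' relabeled copies.

    `episode` is a list of dicts with (hypothetical) keys
    obs, action, achieved_goal and desired_goal.
    """
    out = []
    last = len(episode) - 1
    for t, tr in enumerate(episode):
        # Keep the transition with its original goal.
        reward = sparse_reward(tr["achieved_goal"], tr["desired_goal"])
        out.append(dict(tr, reward=reward))
        for _ in range(n_sampled_goal):
            # 'future': substitute a goal achieved at this step or later
            # in the same episode.
            future = rng.randint(t, last)
            new_goal = episode[future]["achieved_goal"]
            out.append(dict(
                tr,
                desired_goal=new_goal,
                reward=sparse_reward(tr["achieved_goal"], new_goal),
            ))
    return out
```

With `n_sampled_goal=4`, an episode of T transitions produces 5T entries for the replay buffer, so roughly 80% of the stored samples carry a substituted goal.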

Issue Analytics

Created 4 years ago
Reactions: 2
Comments: 15

Top GitHub Comments

1 reaction

araffin commented, Nov 6, 2019

During testing, all the exploration noise is removed and we use a deterministic policy, hence the difference.
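
The difference described here can be sketched as follows. This is an illustration only, not the stable baselines code: `select_action` is a hypothetical helper, and the action bounds of [-1, 1] are an assumption. The defaults mirror `random_exploration: 0.3` and `noise_std: 0.2` from the hyperparameters in the question.

```python
import random


def select_action(policy_action, training, random_exploration=0.3,
                  noise_std=0.2, low=-1.0, high=1.0, rng=random):
    """Hypothetical helper contrasting training and test action selection."""
    if not training:
        # Test time: deterministic policy output, no exploration at all.
        return list(policy_action)
    if rng.random() < random_exploration:
        # With probability `random_exploration`, take a uniformly random action.
        return [rng.uniform(low, high) for _ in policy_action]
    # Otherwise add Gaussian noise to the policy output and clip to the bounds.
    noisy = [a + rng.gauss(0.0, noise_std) for a in policy_action]
    return [min(high, max(low, a)) for a in noisy]
```

Under these settings, about 30% of training actions are fully random and the rest are perturbed, so returns logged during training can look noticeably worse than returns from a deterministic evaluation of the same policy.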

1 reaction

fisherxue commented, Aug 23, 2019

Will do, I’ll get back to you in a few weeks (hardware I would run it on is currently occupied) 😃

