
Cannot reproduce the benchmark results of DQN on Breakout

See original GitHub issue

I used the command below to train DQN on the Breakout environment:

python3 -m baselines.run --alg=deepq --env=BreakoutNoFrameskip-v4 --num_timesteps=10000000

At the end of training I only get a mean 100-episode reward of 14-15. How can I reproduce the benchmark results?

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Comments: 9

Top GitHub Comments

1 reaction
DanielTakeshi commented, Oct 31, 2018

@bywbilly Did you use the exploration schedule I had earlier? I probably should have made it clear: the exploration schedule is shown in the lower-left plot in my figure above. Here it is in my actual code:

    # PiecewiseSchedule is defined in baselines.common.schedules
    from baselines.common.schedules import PiecewiseSchedule

    # Anneal epsilon from 1.0 to 0.1 over the first 1e6 steps, then to 0.01 by 1e7 steps
    exploration = PiecewiseSchedule([
            (0,        1.0),
            (int(1e6), 0.1),
            (int(1e7), 0.01)
    ], outside_value=0.01)
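
For reference, PiecewiseSchedule interpolates linearly between the listed (timestep, value) pairs and returns outside_value past the last point, so epsilon anneals from 1.0 to 0.1 over the first 1e6 steps and then to 0.01 by 1e7 steps. A minimal sketch of querying such a schedule (assuming the PiecewiseSchedule class from baselines.common.schedules):

    from baselines.common.schedules import PiecewiseSchedule

    exploration = PiecewiseSchedule(
        [(0, 1.0), (int(1e6), 0.1), (int(1e7), 0.01)],
        outside_value=0.01)

    # value(t) returns epsilon at timestep t
    print(exploration.value(0))         # 1.0
    print(exploration.value(int(5e5)))  # 0.55, halfway through the first segment
    print(exploration.value(int(1e6)))  # 0.1
    print(exploration.value(int(2e7)))  # 0.01 (outside_value beyond the last point)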
1 reaction
DanielTakeshi commented, Oct 23, 2018

@bywbilly

To get DQN to work you need to adjust several hyperparameters away from the Baselines defaults.

I got Breakout to work several times with different random seeds, all within the past week from master. Here is one example of a training curve from a code base I'm testing with (the top-left panel is probably what you want: the reward averaged over the past 100 episodes):

[Figure: DQN training curves on Breakout (fig_train_results_deepq-01cpu_breakout_2018-10-20-16-35_001)]

Off the top of my head (see the sketch after this comment for how these map onto Baselines arguments):

  • Use the exploration schedule above or something closer to what OpenAI was using before they changed DQN around Oct 2017
  • Replay buffer size: 1e6
  • Learning starts: 80k steps
  • Update target network: every 40k steps
  • Adam learning rate 1e-4, Adam epsilon 1e-4

edit: this is PDD-DQN (prioritized, double, dueling DQN), just to be clear. I ran for 2.5e7 steps.
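
As a rough illustration (not code from the thread), here is how those settings map onto keyword arguments of baselines.deepq.learn on a late-2018 master checkout. Double Q-learning is already enabled by default there, while the piecewise exploration schedule and the Adam epsilon are not exposed as learn() arguments, so those two would still have to be changed inside baselines/deepq/deepq.py:

    from baselines import deepq
    from baselines.common.atari_wrappers import make_atari, wrap_deepmind

    # DeepMind-style Atari preprocessing from Baselines
    env = wrap_deepmind(make_atari('BreakoutNoFrameskip-v4'), frame_stack=True)

    model = deepq.learn(
        env,
        network='conv_only',               # the Atari CNN from the Baselines defaults
        lr=1e-4,                           # Adam learning rate
        total_timesteps=int(2.5e7),        # the run length quoted above
        buffer_size=int(1e6),              # replay buffer size
        learning_starts=80000,             # steps collected before gradient updates begin
        target_network_update_freq=40000,  # target network sync interval
        exploration_final_eps=0.01,        # crude stand-in for the piecewise schedule
        prioritized_replay=True,           # the "P" in PDD-DQN
        dueling=True,                      # the first "D"; forwarded to the Q-network builder
        gamma=0.99,
    )

The same keyword arguments can, in principle, also be appended as extra flags to the baselines.run command from the question, since unrecognized flags are forwarded to learn().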

Read more comments on GitHub >

Top Results From Across the Web

  • Cannot reproduce Breakout benchmark using Double DQN
    I haven't been able to reproduce the results of the Breakout benchmark with Double DQN when using similar hyperparameter values than the ones...
  • Need some help with my Double DQN implementation which ...
    I'm trying to replicate the Mnih et al. 2015/Double DQN results on Atari Breakout but the per-episode rewards (where one episode is a...
  • DQN — Stable Baselines3 1.7.0a5 documentation
    The complete learning curves are available in the associated PR #110. How to replicate the results? Clone the rl-zoo repo: git clone https://github ...
  • Atari score vs reward in rllib DQN implementation
    While the average score of 2 is not much at all relative to the benchmarks for Breakout, 5M steps may not be large...
  • DQN in Pytorch Stream 2 of N - YouTube
    In part two of my DQN series we will focus on optimizations. I will put the model and training onto my GPU, we...
