
BootDQN+ not matching claimed performance

See original GitHub issue

Several runs on deep_sea/0 (i.e., DeepSea with N=10) take longer than 100 episodes to solve, some even longer than 2^10 = 1024 episodes (roughly the number a uniformly random policy needs to first reach the goal), when running the default_agent BootDQN with no modifications.

To reproduce, this is the code I am running in Colab with a GPU runtime:

# First install the baselines: pip install bsuite[baselines]
import bsuite
from bsuite.baselines import experiment
from bsuite.baselines.tf import dqn  # imported in the original snippet but unused here
from bsuite.baselines.tf import boot_dqn

SAVE_PATH_DQN = './logs/test_boot'

# deep_sea/0 is DeepSea with size N=10; results are logged under SAVE_PATH_DQN.
env = bsuite.load_and_record("deep_sea/0", save_path=SAVE_PATH_DQN, overwrite=True)

# Default Bootstrapped DQN agent from the bsuite TF baselines.
agent = boot_dqn.default_agent(
    obs_spec=env.observation_spec(),
    action_spec=env.action_spec(),
)

# Run for the number of episodes prescribed for this bsuite experiment.
experiment.run(agent, env, num_episodes=env.bsuite_num_episodes)

I reran this multiple times and have had a few runs with > 1024 bad episodes.
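For context, the sketch below shows roughly what "> 1024 bad episodes" means here. It is a hand-rolled version of the run loop, not the bsuite scoring code, and the success threshold on the episode return is an assumption:

# Minimal sketch of counting "bad" episodes by hand (assumes the agent/env built above).
# A DeepSea episode is counted as bad when the agent never reaches the rewarding cell,
# i.e. the episode return never gets close to 1; the 0.5 threshold is an assumption,
# not the definition used by the bsuite analysis.
def count_bad_episodes(agent, env, num_episodes):
  bad = 0
  for _ in range(num_episodes):
    timestep = env.reset()
    episode_return = 0.0
    while not timestep.last():
      action = agent.select_action(timestep)
      new_timestep = env.step(action)
      agent.update(timestep, action, new_timestep)
      episode_return += new_timestep.reward
      timestep = new_timestep
    if episode_return < 0.5:
      bad += 1
  return bad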

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 17

Top GitHub Comments

1 reaction
iosband commented, Feb 13, 2021

Thanks a lot - I’m seeing the same flavour of results as you in that Colab.

Interestingly, though, I do not see this error in our runs inside Google. My suspicion is that something is going wrong with the versioning/export… but at the moment I don’t understand what that is…

We will try and get to the bottom of this ASAP - thank you for raising!

0 reactions
iosband commented, Feb 22, 2021

In fact, I’ve run the Colab you linked to and everything is fine now!

Seems like the issue was something to do with TensorFlow Probability versioning and a poor installation in Colab.
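If someone hits the same mismatch, a Colab cell along these lines is one way to check that the TensorFlow / TensorFlow Probability versions actually loaded are consistent; reinstalling everything together is an assumption about the fix, not something confirmed in this thread:

# Hypothetical check: reinstall the baselines so TF and TFP are resolved together,
# then restart the runtime and confirm which versions are actually imported.
!pip install --upgrade "bsuite[baselines]" tensorflow tensorflow-probability

import tensorflow as tf
import tensorflow_probability as tfp
print(tf.__version__, tfp.__version__)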

Top Results From Across the Web

HyperDQN: A Randomized Exploration Method for Deep ...
In particular, the evaluation policy of DoubleDQN at the initial stage is not a random policy so that its performance does not match...
Randomized Prior Functions for Deep Reinforcement Learning
Figure 5 compares the performance of DQN with '-greedy, bootstrap without prior (BS), bootstrap with prior networks (BSP) and the state-of-the-art continuous ...
