BootDQN+ not matching claimed performance
Several runs on deep_sea/0 (i.e., DeepSea with N=10) take longer than 100 episodes to solve, and some take even longer than 2^10 = 1024 episodes, when running the unmodified default_agent BootDQN. Since a dithering policy only reaches the reward with probability roughly 2^-N per episode, needing more than 2^N episodes means the agent is exploring no better than random.
To reproduce, this is the code I am running in Colab with a GPU runtime:
# First install the baselines: pip install bsuite[baselines]
import bsuite
from bsuite.baselines import experiment
from bsuite.baselines.tf import boot_dqn

SAVE_PATH_BOOT = './logs/test_boot'

# Load DeepSea (N=10) and record results as CSV under SAVE_PATH_BOOT.
env = bsuite.load_and_record("deep_sea/0", save_path=SAVE_PATH_BOOT, overwrite=True)

agent = boot_dqn.default_agent(
    obs_spec=env.observation_spec(),
    action_spec=env.action_spec(),
)

experiment.run(agent, env, num_episodes=env.bsuite_num_episodes)
I have rerun this multiple times and had several runs with more than 1024 bad episodes.
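For reference, one way to count the bad episodes from the recorded logs; this is a sketch that assumes the standard bsuite CSV logger layout, the csv_load.load_bsuite loader, and that the deep_sea experiment logs a total_bad_episodes column:

from bsuite.logging import csv_load

# Load everything recorded under SAVE_PATH_BOOT into one dataframe.
df, sweep_vars = csv_load.load_bsuite(SAVE_PATH_BOOT)

# Keep only the deep_sea rows and inspect the running count of bad
# episodes (episodes that failed to find the reward).
deep_sea_df = df[df.bsuite_env == 'deep_sea']
print(deep_sea_df[['episode', 'total_bad_episodes']].tail())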
Thanks a lot - I’m seeing the same flavour of results as you in that colab.
Interestingly, though, I do not see this error in our runs inside Google. My suspicion is that something is going wrong with the versioning/export… but at the moment I don’t understand what that is…
We will try and get to the bottom of this ASAP - thank you for raising!
In fact, I’ve run the colab you linked to and everything is fine now!
It seems the issue was something to do with TensorFlow Probability versioning and a broken installation in Colab.
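If others hit the same thing, a quick sanity check is to print the installed TensorFlow / TensorFlow Probability versions before running, and reinstall a matching pair if they disagree; the pinned versions below are illustrative assumptions, not the confirmed fix:

import tensorflow as tf
import tensorflow_probability as tfp

# A mismatched TF / TFP pair is a common source of silent breakage in Colab.
print('tensorflow:', tf.__version__)
print('tensorflow_probability:', tfp.__version__)

# If they look inconsistent, reinstall a matching pair, e.g. (illustrative):
# !pip install --upgrade 'tensorflow==2.8.*' 'tensorflow_probability==0.16.*'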