[rllib] IMPALA can't converge on cluster with Ray 0.6.4
System information
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
- Ray installed from (source or binary): Binary
- Ray version: 0.6.4
- Python version: 3.6.8
Describe the problem
I have a cluster consisting of a p2 head node and a c4 worker node on AWS. IMPALA can't converge when I train on the cluster with Ray 0.6.4. I originally ran on a smaller machine with Ray 0.6.3, so I tried the old Ray version on the cluster, and IMPALA converged again. This was tested on BreakoutNoFrameskip-v4, PongNoFrameskip-v4, and AtlantisNoFrameskip-v4.
Source code / logs
```python
import ray
from ray import tune
from ray.tune.registry import register_env
from ray.rllib.env.atari_wrappers import wrap_deepmind
from ray.rllib.agents.impala import ImpalaAgent
from ray.rllib.agents.ppo import PPOAgent
from ray.rllib.agents.dqn import DQNAgent

ray.init()

''' Breakout Experiment '''
trials = tune.run_experiments({
    "breakout": {
        "run": "IMPALA",
        "env": "BreakoutNoFrameskip-v4",
        "checkpoint_freq": 5,  # model checkpoint
        "stop": {
            "timesteps_total": 10000000
        },
        "config": {
            "num_gpus": 1,
            "num_workers": 32,
            "num_envs_per_worker": 10,
            "clip_rewards": True
        }
    },
}, resume=False)
```
Issue Analytics
- Created 5 years ago
- Comments: 10 (7 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Hey guys, sorry about the issue, and you did the right thing by reverting the commit. We did train PongDeterministic-v4 with our changes; I'm not sure whether we changed the convergence rate on that one. @stefanpantic and I will take a look at why IMPALA behaves differently with our changes on BreakoutNoFrameskip-v4.
Sorry, false alarm. It was my problem: I had created a class wrapper around the object returned by `gym.make`, but then `atari_wrapper` could not properly preprocess the frames.
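The pitfall described in that last comment can be sketched without gym or Ray installed. A hand-rolled wrapper class that merely holds the env in an attribute hides the underlying env's metadata from downstream code that inspects it, whereas a gym.Wrapper-style class delegates attribute lookups. The class names and the `spec_id` check below are illustrative stand-ins, not RLlib's actual code:

```python
# Minimal sketch (no gym/ray required) of why an opaque class wrapper can
# break downstream preprocessing. Downstream code (here, a stand-in for
# the atari-wrapper check) inspects attributes of the underlying env.

class AtariEnv:
    """Stands in for the env returned by gym.make()."""
    def __init__(self):
        self.spec_id = "BreakoutNoFrameskip-v4"

class OpaqueWrapper:
    """A hand-rolled wrapper: the pitfall. It does not expose the
    wrapped env's attributes, so inspection from outside fails."""
    def __init__(self, env):
        self.env = env

class TransparentWrapper:
    """gym.Wrapper-style: unknown attribute lookups are delegated
    to the wrapped env, so inspection still works."""
    def __init__(self, env):
        self.env = env
    def __getattr__(self, name):
        return getattr(self.env, name)

def needs_atari_preprocessing(env):
    # Stand-in for the kind of check atari-wrapper code performs.
    return "NoFrameskip" in getattr(env, "spec_id", "")

print(needs_atari_preprocessing(OpaqueWrapper(AtariEnv())))       # False
print(needs_atari_preprocessing(TransparentWrapper(AtariEnv())))  # True
```

With the opaque wrapper the check silently fails and the Atari frames never get preprocessed, which looks exactly like a convergence regression. Subclassing `gym.Wrapper` (or registering the env via `register_env` so RLlib applies `wrap_deepmind` itself) avoids this.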