MPIAdam synchronization error in PPO1
Describe the bug
A simple multi-worker run of PPO1 crashes. The assertion thetaroot == thetalocal in MpiAdam fails, and it is not a NaN problem: the two parameter vectors simply contain slightly different floats. The same run does not crash in baselines.
Code example
Minimal reproducible example (crashes when launched with mpirun -n 2, as in the stdout below):
import gym
from stable_baselines.common.policies import MlpPolicy
from stable_baselines import PPO1
env = gym.make("CartPole-v1")
model = PPO1(MlpPolicy, env, verbose=1)
model.learn(total_timesteps=10000)
System Info
- Installed from source in virtual environment
- No GPU
- Python 3.6.5
- mpi4py==3.0.0
- tensorflow==1.8.0
- Open MPI 3.1.1
- commit 4983566292a5d3ae0ed1a6bce84a8ac8278e3de5
Stdout + Traceback
(venv) petersen33md:runs petersen33md$ mpirun -n 2 python ppo1_test.py
********** Iteration 0 ************
…
********** Iteration 6 ************
Optimizing...
pol_surr | pol_entpen | vf_loss | kl | ent
Optimizing...
pol_surr | pol_entpen | vf_loss | kl | ent
-0.00082 | -0.00627 | 117.56442 | 8.56e-05 | 0.62709
-0.00030 | -0.00630 | 128.11664 | 7.79e-05 | 0.63015
Traceback (most recent call last):
File "ppo1_test.py", line 160, in <module>
model.learn(total_timesteps=10000, callback=callback)
File "/Users/petersen33/repositories/stable-baselines/stable_baselines/ppo1/pposgd_simple.py", line 272, in learn
self.adam.update(grad, self.optim_stepsize * cur_lrmult)
File "/Users/petersen33/repositories/stable-baselines/stable_baselines/common/mpi_adam.py", line 48, in update
self.check_synced()
File "/Users/petersen33/repositories/stable-baselines/stable_baselines/common/mpi_adam.py", line 83, in check_synced
assert (thetaroot == thetalocal).all(), (thetaroot, thetalocal)
AssertionError: (array([ 0.04382617, -0.0679653 , -0.11690815, ..., 0.00065254,
0. , 0. ], dtype=float32), array([ 0.04383327, -0.06797152, -0.11691316, ..., 0.00065254,
0. , 0. ], dtype=float32))
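The two arrays in the failed assertion differ only in the low decimal places, which looks like numerical drift between workers rather than NaNs. As a hedged toy illustration (names and numbers are ours, not from stable-baselines), two workers that apply the same gradient with slightly different annealed learning rates produce parameter vectors that fail exactly this kind of bitwise comparison:

```python
import numpy as np

# Hypothetical illustration: each worker anneals its learning rate from its
# own local timestep counter. Once those counters differ, identical gradient
# updates yield slightly different parameters on each worker.
def annealed_lr(base_lr, timesteps_so_far, total_timesteps):
    """Linear annealing of the learning rate, as with schedule='linear'."""
    return base_lr * max(1.0 - timesteps_so_far / total_timesteps, 0.0)

theta_a = np.zeros(3, dtype=np.float32)  # worker A's parameters
theta_b = np.zeros(3, dtype=np.float32)  # worker B's parameters
grad = np.array([1.0, -2.0, 0.5], dtype=np.float32)  # same gradient on both

# Worker A believes 4000 timesteps have elapsed; worker B believes 3800.
theta_a = theta_a - annealed_lr(3e-4, 4000, 10000) * grad
theta_b = theta_b - annealed_lr(3e-4, 3800, 10000) * grad

# A bitwise sync check like check_synced() must now fail:
assert not (theta_a == theta_b).all()
```

This is only a sketch of the failure mode; the actual cause in this issue is diagnosed in the comments below.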
Issue Analytics
- Created 5 years ago
- Comments: 7
Top GitHub Comments
Update: I spotted the bug. I noticed that the error did not persist with the PPO1 argument schedule="constant" (the default value is "linear"). Annealing occurs based on the value of timesteps_so_far.

In baselines, timesteps_so_far is calculated by MPI-gathering episodes across all workers. Relevant baselines code here:

However, in stable-baselines, timesteps_so_far is based on the current worker only (which apparently can differ):

The "total_timesteps" key (which isn't in baselines) was added at some point to avoid the "mean of an empty slice" warning when no episodes had completed. But the local values were never MPI-gathered.

To fix the bug, I changed the previous line to:

and everything is working fine now. Let me know if you'd like me to submit a PR.
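The reporter's one-line change is elided above. As a hedged sketch of the idea (the helper name and structure are ours, not the actual stable-baselines code), the local per-iteration timestep count can be summed across workers with an MPI allreduce, so every worker advances timesteps_so_far by the same global amount and anneals its learning rate identically:

```python
# Sketch under stated assumptions: sum per-worker timestep counts across MPI
# so that learning-rate annealing uses a counter that agrees on all workers.
try:
    from mpi4py import MPI
    comm = MPI.COMM_WORLD
except ImportError:  # fall back to single-process behaviour
    comm = None

def global_timesteps(local_timesteps, comm=comm):
    """Return this iteration's timestep count summed over all MPI workers."""
    if comm is None or comm.Get_size() == 1:
        return local_timesteps
    return comm.allreduce(local_timesteps, op=MPI.SUM)

# Every worker would then do, e.g.:
#   timesteps_so_far += global_timesteps(local_count)
# instead of advancing the counter from its local rollout alone.
```

With identical counters, the annealed learning rate is identical on every worker, so identical updates keep the parameters bit-identical and check_synced() passes.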
Code and tests updated, closing.