Non-deterministic behaviour when run on GPU
The following commit https://github.com/openai/baselines/commit/9fa8e1baf1d1f975b87b369a8082122eac812eb1#diff-fc3e1c3522d2c7871bda86ed40bcb0ddL28 introduced non-deterministic behavior of PPO1 when run on GPU, even when the seed is set with tf.set_random_seed (CPU behavior is deterministic). Specifically, at line 28 and elsewhere in mlp_policy.py, replacing
U.dense(last_out, hid_size, name='fc%i'%(i+1), weight_init=U.normc_initializer(1.0))
with
tf.layers.dense(last_out, hid_size, name='fc%i'%(i+1), kernel_initializer=U.normc_initializer(1.0))
caused this behavior. Below are 4 runs of the MuJoCo Swimmer-v2 environment with the same random seed, using PPO1 in the latest version of the baselines code:
swimmer_same_seed_new_code.pdf
Replacing all instances of tf.layers.dense with U.dense, and adding the corresponding code
def dense(x, size, name, weight_init=None, bias=True):
    w = tf.get_variable(name + "/w", [x.get_shape()[1], size], initializer=weight_init)
    ret = tf.matmul(x, w)
    if bias:
        b = tf.get_variable(name + "/b", [size], initializer=tf.zeros_initializer())
        return ret + b
    else:
        return ret
back to tf_utils.py fixes the issue. Below is a figure with 4 Swimmer runs after this change:
swimmer_same_seed_old_code.pdf
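For anyone who wants to check this without comparing plots by eye, a minimal sketch of a bitwise comparison between runs follows. It assumes each run has been logged as a plain list of per-iteration returns; the helper names `runs_are_identical` and `first_divergence` are hypothetical and not part of baselines.

```python
def runs_are_identical(runs):
    """Return True if every run matches the first one exactly (no tolerance)."""
    reference = runs[0]
    return all(run == reference for run in runs[1:])


def first_divergence(run_a, run_b):
    """Return the first index where two runs differ, or None if they match."""
    for i, (x, y) in enumerate(zip(run_a, run_b)):
        if x != y:
            return i
    # One run may simply be longer than the other.
    if len(run_a) != len(run_b):
        return min(len(run_a), len(run_b))
    return None


# Made-up numbers, for illustration only (not measurements from the issue).
deterministic_runs = [[10.0, 12.5, 15.25], [10.0, 12.5, 15.25]]
drifting_runs = [[10.0, 12.5, 15.25], [10.0, 12.5000001, 15.9]]

print(runs_are_identical(deterministic_runs))
print(runs_are_identical(drifting_runs))
print(first_divergence(*drifting_runs))
```

Exact equality is used on purpose: with a fixed seed, a deterministic pipeline should reproduce the curve bit for bit, so any tolerance would hide the effect being tested.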
All experiments were run using
tensorflow-gpu==1.12.0
cudatoolkit==9.2
cudnn==7.3.1
Top GitHub Comments
GPU calculations are non-deterministic because the thread scheduling is non-deterministic. Floating-point addition is not associative, so rounding errors accumulate in unpredictable ways when the order of the partial sums changes from run to run. This is a consequence of the GPU hardware itself, not TensorFlow.
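The order dependence can be reproduced on a CPU in plain Python. The sketch below is only an illustration of non-associative floating-point addition, with values picked by hand to make the rounding visible; it does not model how TensorFlow or CUDA schedule a reduction.

```python
def sum_in_order(values, order):
    """Accumulate values with float addition, visiting indices in the given order."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


# Grouping changes the result: 1e16 is large enough that adding 1.0 is lost to rounding.
a, b, c = 1e16, -1e16, 1.0
left_first = (a + b) + c    # a and b cancel first, so the 1.0 survives
right_first = a + (b + c)   # b + c rounds back to -1e16, so the 1.0 is lost

# The same four numbers summed in two different orders.
values = [1e16, 1.0, -1e16, 1.0]
order_one = sum_in_order(values, [0, 1, 2, 3])
order_two = sum_in_order(values, [0, 2, 1, 3])

print(left_first, right_first)
print(order_one, order_two)
```

On a GPU the analogue of `order` is decided by which threads finish first, which is why two runs with the same seed can produce slightly different sums that then diverge over many training updates.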
This same phenomenon would occur on a multi-core CPU too, but I believe TensorFlow typically does not parallelize operations that lose determinism when using a CPU because the performance loss is minimal. This is why your CPU output is deterministic.
You can read these links for more info:
I think you should open a pull request with those changes (I can do it if you want). The owners can merge it if they approve it.