[rllib] Unable to restore multiagent PPO policy models with tensorflow
What is the problem?
Ray version: 1.0.1. TensorFlow version: 2.3.1. Operating systems tested: Ubuntu 18.04 and macOS Mojave.
Hi, I am trying to export a trained policy from a multi-agent environment as a TensorFlow model, but restoring the exported model raises an UnliftableError. I tried to simplify the reproduction script as much as possible.
Reproduction (REQUIRED)
from gym.spaces import Discrete
import ray
from ray.rllib.examples.env.rock_paper_scissors import RockPaperScissors
from ray.rllib.agents import ppo
select_policy = lambda agent_id: "policy_01" if agent_id == "player1" else "policy_02"
config = {
    "multiagent": {
        "policies": {
            "policy_01": (None, Discrete(3), Discrete(3), {}),
            "policy_02": (None, Discrete(3), Discrete(3), {}),
        },
        "policy_mapping_fn": select_policy,
    },
}
ray.init()
trainer = ppo.PPOTrainer(env=RockPaperScissors, config=config)
trainer.train() # Train one step
trainer.export_policy_model("exported_model", "policy_01")
Once the model is saved, try to restore it in TensorFlow with the following two lines:
import tensorflow as tf
tf.saved_model.load("exported_model")
This produces the following error:
WARNING:tensorflow:From /Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/ray/rllib/policy/tf_policy.py:653: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/timestep_1:0' shape=() dtype=int64_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/kl_coeff:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/entropy_coeff:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/lr:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/global_step:0' shape=() dtype=int64_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/timestep_1:0' shape=() dtype=int64_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/kl_coeff:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/entropy_coeff:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/lr:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/global_step:0' shape=() dtype=int64_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Some variables could not be lifted out of a loaded function. Run the tf.initializers.tables_initializer() operation to restore these variables.
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/timestep_1:0' shape=() dtype=int64_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/kl_coeff:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/entropy_coeff:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/lr:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/global_step:0' shape=() dtype=int64_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
Traceback (most recent call last):
File "minimal.py", line 27, in <module>
tf.saved_model.load("exported_model")
File "/Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/tensorflow/python/saved_model/load.py", line 603, in load
return load_internal(export_dir, tags, options)
File "/Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/tensorflow/python/saved_model/load.py", line 649, in load_internal
root = load_v1_in_v2.load(export_dir, tags)
File "/Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/tensorflow/python/saved_model/load_v1_in_v2.py", line 263, in load
return loader.load(tags=tags)
File "/Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/tensorflow/python/saved_model/load_v1_in_v2.py", line 246, in load
signature_functions = self._extract_signatures(wrapped, meta_graph_def)
File "/Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/tensorflow/python/saved_model/load_v1_in_v2.py", line 158, in _extract_signatures
signature_fn = wrapped.prune(feeds=feeds, fetches=fetches)
File "/Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/tensorflow/python/eager/wrap_function.py", line 338, in prune
base_graph=self._func_graph)
File "/Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/tensorflow/python/eager/lift_to_graph.py", line 260, in lift_to_graph
add_sources=add_sources))
File "/Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/tensorflow/python/ops/op_selector.py", line 413, in map_subgraph
% (repr(init_tensor), repr(op), _path_from(op, init_tensor, sources)))
tensorflow.python.ops.op_selector.UnliftableError: A SavedModel signature needs an input for each placeholder the signature's outputs use. An output for signature 'serving_default' depends on a placeholder which is not an input (i.e. the placeholder is not fed a value).
Unable to lift tensor <tf.Tensor 'policy_01/cond_2/Merge:0' shape=(?,) dtype=float32> because it depends transitively on placeholder <tf.Operation 'policy_01/timestep' type=Placeholder> via at least one path, e.g.: policy_01/cond_2/Merge (Merge) <- policy_01/cond_2/Switch_1 (Switch) <- policy_01/cond_2/pred_id (Identity) <- policy_01/LogicalAnd (LogicalAnd) <- policy_01/GreaterEqual (GreaterEqual) <- policy_01/timestep (Placeholder)
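The condition described in the error message can be illustrated with a toy dependency check. This is only a sketch in plain Python, not TensorFlow's actual lifting code: the op chain is copied from the message above, while `policy_01/obs`, `policy_01/logits`, and the helper `unfed_placeholders` are hypothetical names added for contrast.

```python
from collections import deque

# Toy dependency graph: each op maps to the ops it reads from.
# The Merge -> ... -> timestep chain is copied from the error message.
# "policy_01/obs" and "policy_01/logits" are hypothetical, for contrast only.
GRAPH = {
    "policy_01/cond_2/Merge": ["policy_01/cond_2/Switch_1"],
    "policy_01/cond_2/Switch_1": ["policy_01/cond_2/pred_id"],
    "policy_01/cond_2/pred_id": ["policy_01/LogicalAnd"],
    "policy_01/LogicalAnd": ["policy_01/GreaterEqual"],
    "policy_01/GreaterEqual": ["policy_01/timestep"],
    "policy_01/logits": ["policy_01/obs"],
    "policy_01/timestep": [],
    "policy_01/obs": [],
}
PLACEHOLDERS = {"policy_01/timestep", "policy_01/obs"}


def unfed_placeholders(outputs, inputs):
    """Return placeholders reachable from `outputs` that are not in `inputs`."""
    seen, queue, missing = set(), deque(outputs), set()
    while queue:
        op = queue.popleft()
        if op in seen:
            continue
        seen.add(op)
        if op in PLACEHOLDERS and op not in inputs:
            missing.add(op)
        # Keep walking backwards through everything this op depends on.
        queue.extend(GRAPH.get(op, []))
    return missing


outputs = ["policy_01/cond_2/Merge", "policy_01/logits"]

# Signature that declares only the (hypothetical) observation input.
missing_without_timestep = unfed_placeholders(outputs, inputs={"policy_01/obs"})
# Same signature, with the timestep placeholder also declared as an input.
missing_with_timestep = unfed_placeholders(
    outputs, inputs={"policy_01/obs", "policy_01/timestep"}
)

print(missing_without_timestep)  # {'policy_01/timestep'}
print(missing_with_timestep)     # set()
```

In this toy model the first signature is "unliftable" because an output reaches `policy_01/timestep` without that placeholder being listed as an input, which mirrors the path reported in the traceback.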
- I have verified my script runs in a clean environment and reproduces the issue.
- I have verified the issue also occurs with the latest wheels.
Issue Analytics
- State:
- Created 3 years ago
- Comments:7 (5 by maintainers)

Thanks for filing this @ivallesp! Taking a look rn. I can reproduce the above error.
Closing this issue. Feel free to re-open if the above solution does not fix the problem on your end. I was also able to make these "Unable to create a python object for variable <tf.Variabl..." warnings go away. But these were unrelated to the actual error/crash.