Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

[rllib] Unable to restore multiagent PPO policy models with tensorflow

See original GitHub issue

What is the problem?

Ray version: 1.0.1 Tensorflow version: 2.3.1 Operative systems tested: Ubuntu 18.04 and MacOS Mojave

Hi, I am trying to export a trained policy in a multiagent environment as a tensorflow model, but it is dropping me an UnliftableError. I tried to simplify the reproduction script as much as possible.

Reproduction (REQUIRED)

from gym.spaces import Discrete

import ray
from ray.rllib.examples.env.rock_paper_scissors import RockPaperScissors
from ray.rllib.agents import ppo


select_policy = lambda agent_id: "policy_01" if agent_id == "player1" else "policy_02"

config = {
    "multiagent": {
        "policies": {
            "policy_01": (None, Discrete(3), Discrete(3), {}),
            "policy_02": (None, Discrete(3), Discrete(3), {}),
        },
        "policy_mapping_fn": select_policy,
    },
}

ray.init()
trainer = ppo.PPOTrainer(env=RockPaperScissors, config=config)
trainer.train()  # Train one step
trainer.export_policy_model("exported_model", "policy_01")

Once the model is saved, try to restore it in tensorflow with the following 2 lines.

import tensorflow as tf
tf.saved_model.load("exported_model")

This drops me the following error:

WARNING:tensorflow:From /Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/ray/rllib/policy/tf_policy.py:653: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/timestep_1:0' shape=() dtype=int64_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/kl_coeff:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/entropy_coeff:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/lr:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/global_step:0' shape=() dtype=int64_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/timestep_1:0' shape=() dtype=int64_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/kl_coeff:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/entropy_coeff:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/lr:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/global_step:0' shape=() dtype=int64_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Some variables could not be lifted out of a loaded function. Run the tf.initializers.tables_initializer() operation to restore these variables.
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/timestep_1:0' shape=() dtype=int64_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/kl_coeff:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/entropy_coeff:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/lr:0' shape=() dtype=float32_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
WARNING:tensorflow:Unable to create a python object for variable <tf.Variable 'policy_01/global_step:0' shape=() dtype=int64_ref> because it is a reference variable. It may not be visible to training APIs. If this is a problem, consider rebuilding the SavedModel after running tf.compat.v1.enable_resource_variables().
Traceback (most recent call last):
  File "minimal.py", line 27, in <module>
    tf.saved_model.load("exported_model")
  File "/Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/tensorflow/python/saved_model/load.py", line 603, in load
    return load_internal(export_dir, tags, options)
  File "/Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/tensorflow/python/saved_model/load.py", line 649, in load_internal
    root = load_v1_in_v2.load(export_dir, tags)
  File "/Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/tensorflow/python/saved_model/load_v1_in_v2.py", line 263, in load
    return loader.load(tags=tags)
  File "/Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/tensorflow/python/saved_model/load_v1_in_v2.py", line 246, in load
    signature_functions = self._extract_signatures(wrapped, meta_graph_def)
  File "/Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/tensorflow/python/saved_model/load_v1_in_v2.py", line 158, in _extract_signatures
    signature_fn = wrapped.prune(feeds=feeds, fetches=fetches)
  File "/Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/tensorflow/python/eager/wrap_function.py", line 338, in prune
    base_graph=self._func_graph)
  File "/Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/tensorflow/python/eager/lift_to_graph.py", line 260, in lift_to_graph
    add_sources=add_sources))
  File "/Users/ivallesp/projects/rockpaperscisors/.venv/lib/python3.7/site-packages/tensorflow/python/ops/op_selector.py", line 413, in map_subgraph
    % (repr(init_tensor), repr(op), _path_from(op, init_tensor, sources)))
tensorflow.python.ops.op_selector.UnliftableError: A SavedModel signature needs an input for each placeholder the signature's outputs use. An output for signature 'serving_default' depends on a placeholder which is not an input (i.e. the placeholder is not fed a value).

Unable to lift tensor <tf.Tensor 'policy_01/cond_2/Merge:0' shape=(?,) dtype=float32> because it depends transitively on placeholder <tf.Operation 'policy_01/timestep' type=Placeholder> via at least one path, e.g.: policy_01/cond_2/Merge (Merge) <- policy_01/cond_2/Switch_1 (Switch) <- policy_01/cond_2/pred_id (Identity) <- policy_01/LogicalAnd (LogicalAnd) <- policy_01/GreaterEqual (GreaterEqual) <- policy_01/timestep (Placeholder)

I have verified my script runs in a clean environment and reproduces the issue.
I have verified the issue also occurs with the latest wheels.

Issue Analytics

State:
Created 3 years ago
Comments:7 (5 by maintainers)

Top GitHub Comments

1reaction

sven1977commented, Dec 11, 2020

Thanks for filing this @ivallesp! Taking a look rn. I can reproduce the above error.

0reactions

sven1977commented, Dec 11, 2020

Closing this issue. Feel free to re-open if the above solution does not fix the problem on your end. I was also able to make these Unable to create a python object for variable <tf.Variabl... warnings go away. But these were unrelated to the actual error/crash.

Top Results From Across the Web

Examples — Ray 2.2.0

This blog post is a brief tutorial on multi-agent RL and its design in RLlib. Functional RL with Keras and TensorFlow Eager:.

Environments — Ray 2.2.0

RLlib works with several different types of environments, including OpenAI Gym, user-defined, multi-agent, and also batched environments.

Getting Started with RLlib — Ray 2.2.0 - the Ray documentation

In multi-agent training, the algorithm manages the querying and optimization ... PPO algo = PPO(config=config, env=env_class) algo.restore(checkpoint_path)

Base Policy class (ray.rllib.policy.policy.Policy) — Ray 2.2.0

If None and an Algorithm checkpoint is provided, will restore all policies found in that checkpoint. If a Policy checkpoint is given, this...

Algorithms — Ray 2.2.0 - the Ray documentation

Proximal Policy Optimization (PPO)#. pytorch · tensorflow [paper] [implementation] PPO's clipped objective supports multiple SGD passes over the same batch of ...