[rllib] TF2 TFModelV2 Custom model variables does not appear in `model.variables()`
See original GitHub issue

- OS Platform and Distribution: Arch Linux
- Ray installed from (source or binary): installed with pip
- Ray version: 1.0.1.post1
- Python version: 3.8.6
- Exact command to reproduce: python custom_model.py --framework tf2
# custom_model.py
import argparse
import pathlib

import gym
import ray
import ray.rllib.agents.ppo as ppo
from ray import tune
from ray.rllib.models import ModelCatalog
from ray.rllib.models.modelv2 import ModelV2
from ray.rllib.models.tf.misc import normc_initializer
from ray.rllib.models.tf.tf_modelv2 import TFModelV2
from ray.rllib.utils.annotations import override
from ray.rllib.utils.framework import try_import_tf

tf1, tf, tfv = try_import_tf()


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--env", type=str, default="CartPole-v1", help="Gym environment.")
    parser.add_argument("--framework", type=str, default="tf2", help="Deep learning framework (tf, tf2, tfe, or torch).")
    parser.add_argument("--logdir", type=str, default="runs", help="Path to the logdir.")
    parser.add_argument("--num_workers", type=int, default=6, help="Number of Ray workers.")
    parser.add_argument("--num_gpus", type=int, default=0, help="Number of available GPUs.")
    parser.add_argument("--episode_reward_mean", type=float, default=300, help="Stop criterion (mean episode reward).")
    args = parser.parse_args()
    args.logdir = pathlib.Path(args.logdir)
    return args


class MyKerasModel(TFModelV2):
    """Custom model for policy gradient algorithms."""

    def __init__(self, obs_space, action_space, num_outputs, model_config, name):
        super().__init__(obs_space, action_space, num_outputs, model_config, name)
        self.inputs = tf.keras.layers.Input(shape=obs_space.shape, name="observations")
        # Actor (policy) head.
        actor = tf.keras.layers.Dense(64, activation="relu")(self.inputs)
        layer_out = tf.keras.layers.Dense(num_outputs, name="my_out", activation=None)(actor)
        # Critic (value) head.
        critic = tf.keras.layers.Dense(64, activation="relu")(self.inputs)
        value_out = tf.keras.layers.Dense(1, name="value_out", activation=None)(critic)
        self.base_model = tf.keras.Model(self.inputs, [layer_out, value_out])
        self.register_variables(self.base_model.variables)

    @override(ModelV2)
    def forward(self, input_dict, state, seq_lens):
        model_out, self._value_out = self.base_model(input_dict["obs"])
        return model_out, state

    @override(ModelV2)
    def value_function(self):
        return tf.reshape(self._value_out, [-1])

    def metrics(self):
        return {}


def train(args):
    # Register the custom model with RLlib.
    ModelCatalog.register_custom_model("keras_model", MyKerasModel)

    config = ppo.DEFAULT_CONFIG.copy()
    config.update({
        "env": args.env,
        "framework": args.framework,
        "num_gpus": args.num_gpus,
        "num_workers": args.num_workers,
        "lr": 0.001,
    })
    config["model"].update({
        "custom_model": "keras_model",
    })
    stop = {
        "episode_reward_mean": args.episode_reward_mean,
    }
    tune.run(
        ppo.PPOTrainer,
        config=config,
        stop=stop,
        local_dir=args.logdir / args.env.lower().replace(" ", "_"),
        checkpoint_freq=10,
        checkpoint_at_end=True,
    )


if __name__ == "__main__":
    args = parse_args()
    ray.init(address=None)
    train(args)
What is the problem?
I am getting the following ValueError:
File "python/ray/_raylet.pyx", line 443, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 477, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 481, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 482, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 436, in ray._raylet.execute_task.function_executor
File "/home/juhlik/.cache/pypoetry/virtualenvs/tryout-sqdI2LqI-py3.8/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 456, in __init__
self.policy_map, self.preprocessors = self._build_policy_map(
File "/home/juhlik/.cache/pypoetry/virtualenvs/tryout-sqdI2LqI-py3.8/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1059, in _build_policy_map
policy_map[name] = cls(obs_space, act_space, merged_conf)
File "/home/juhlik/.cache/pypoetry/virtualenvs/tryout-sqdI2LqI-py3.8/lib/python3.8/site-packages/ray/rllib/policy/eager_tf_policy.py", line 235, in __init__
self.model = ModelCatalog.get_model_v2(
File "/home/juhlik/.cache/pypoetry/virtualenvs/tryout-sqdI2LqI-py3.8/lib/python3.8/site-packages/ray/rllib/models/catalog.py", line 346, in get_model_v2
raise ValueError(
ValueError: It looks like variables {<tf.Variable 'default_policy/Variable:0' shape=() dtype=int64, numpy=0>, <tf.Variable 'default_policy/Variable:0' shape=() dtype=int64, numpy=0>, <tf.Variable 'default_policy/Variable:0' shape=() dtype=int64, numpy=0>, <tf.Variable 'default_policy/Variable:0' shape=() dtype=int64, numpy=0>, <tf.Variable 'default_policy/Variable:0' shape=() dtype=int64, numpy=0>, <tf.Variable 'default_policy/Variable:0' shape=() dtype=int64, numpy=0>} were created as part of <__main__.MyKerasModel object at 0x7f72d221f040> but does not appear in model.variables() ({<tf.Variable 'dense/kernel:0' shape=(4, 64) dtype=float32, numpy=
array([[ 0.19838217, 0.12977728, 0.13880992, -0.02866092, -0.27592704,
...
dtype=float32)>, <tf.Variable 'dense/bias:0' shape=(64,) dtype=float32, numpy=
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)>, <tf.Variable 'my_out/bias:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)>, <tf.Variable 'value_out/kernel:0' shape=(64, 1) dtype=float32, numpy=
array([[ 0.2723226 ],
...
[-0.18534249]], dtype=float32)>, <tf.Variable 'my_out/kernel:0' shape=(64, 2) dtype=float32, numpy=
array([[ 0.13135597, 0.20843989],
...
[ 0.117394 , 0.24626827]], dtype=float32)>, <tf.Variable 'dense_1/bias:0' shape=(64,) dtype=float32, numpy=
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)>, <tf.Variable 'value_out/bias:0' shape=(1,) dtype=float32, numpy=array([0.], dtype=float32)>}). Did you forget to call model.register_variables() on the variables in question?
Note that python custom_model.py --framework tf works without any issue.
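One way to see which variables the model actually registers is to instantiate it directly, outside of Tune. A minimal sketch, assuming the CartPole-v1 spaces, eager (tf2) mode, and a hypothetical name "debug_model":

# Hypothetical inspection snippet (not part of the training script above):
# build the model by hand and print what ends up in model.variables().
import gym
env = gym.make("CartPole-v1")
model = MyKerasModel(env.observation_space, env.action_space,
                     num_outputs=2, model_config={}, name="debug_model")
for v in model.variables():
    print(v.name, v.shape, v.dtype)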
- I have verified my script runs in a clean environment and reproduces the issue.
- I have verified the issue also occurs with the latest wheels.
Issue Analytics
- Created: 3 years ago
- Comments: 13 (7 by maintainers)
Surprisingly, commenting out the following made the problem go away. I did this because the callback said "This is no longer required", but I am not entirely sure whether this is a proper solution.
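Presumably this refers to the register_variables call in MyKerasModel.__init__, i.e. something like the following sketch (an assumption based on the "This is no longer required" quote above):

        self.base_model = tf.keras.Model(self.inputs, [layer_out, value_out])
        # Commenting out this call is the workaround described above
        # (assumption: this is the line the deprecation hint refers to).
        # self.register_variables(self.base_model.variables)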
The experiment ran fine, although I did not get good results (the reason for that is unknown at the moment).
I am having the same issue with Ray 1.1.0 and TensorFlow 2.4.1.