[rllib] TF2 TFModelV2 Custom model variables does not appear in `model.variables()`
See original GitHub issue

- OS Platform and Distribution: Arch Linux
- Ray installed from (source or binary): installed with pip
- Ray version: 1.0.1.post1
- Python version: 3.8.6
- Exact command to reproduce: python custom_model.py --framework tf2
# custom_model.py
import argparse
import pathlib

import gym
import ray
import ray.rllib.agents.ppo as ppo
from ray import tune
from ray.rllib.models import ModelCatalog
from ray.rllib.models.modelv2 import ModelV2
from ray.rllib.models.tf.misc import normc_initializer
from ray.rllib.models.tf.tf_modelv2 import TFModelV2
from ray.rllib.utils.annotations import override
from ray.rllib.utils.framework import try_import_tf

tf1, tf, tfv = try_import_tf()


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--env", type=str, default="CartPole-v1", help="Gym environment.")
    parser.add_argument("--framework", type=str, default="tf2", help="Deep learning framework (tf, tf2, tfe, or torch).")
    parser.add_argument("--logdir", type=str, default="runs", help="Path to the logdir.")
    parser.add_argument("--num_workers", type=int, default=6, help="Number of Ray workers.")
    parser.add_argument("--num_gpus", type=int, default=0, help="Number of available GPUs.")
    parser.add_argument("--episode_reward_mean", type=float, default=300, help="Stop criterion (mean episode reward).")
    args = parser.parse_args()
    args.logdir = pathlib.Path(args.logdir)
    return args


class MyKerasModel(TFModelV2):
    """Custom model for policy gradient algorithms."""

    def __init__(self, obs_space, action_space, num_outputs, model_config, name):
        super().__init__(obs_space, action_space, num_outputs, model_config, name)
        self.inputs = tf.keras.layers.Input(shape=obs_space.shape, name="observations")
        # Actor (policy) head.
        actor = tf.keras.layers.Dense(64, activation="relu")(self.inputs)
        layer_out = tf.keras.layers.Dense(num_outputs, name="my_out", activation=None)(actor)
        # Critic (value) head.
        critic = tf.keras.layers.Dense(64, activation="relu")(self.inputs)
        value_out = tf.keras.layers.Dense(1, name="value_out", activation=None)(critic)
        self.base_model = tf.keras.Model(self.inputs, [layer_out, value_out])
        self.register_variables(self.base_model.variables)

    @override(ModelV2)
    def forward(self, input_dict, state, seq_lens):
        model_out, self._value_out = self.base_model(input_dict["obs"])
        return model_out, state

    @override(ModelV2)
    def value_function(self):
        return tf.reshape(self._value_out, [-1])

    def metrics(self):
        return {}


def train(args):
    # Register the custom model with RLlib.
    ModelCatalog.register_custom_model("keras_model", MyKerasModel)

    config = ppo.DEFAULT_CONFIG.copy()
    config.update({
        "env": args.env,
        "framework": args.framework,
        "num_gpus": args.num_gpus,
        "num_workers": args.num_workers,
        "lr": 0.001,
    })
    config["model"].update({
        "custom_model": "keras_model",
    })
    stop = {
        "episode_reward_mean": args.episode_reward_mean,
    }
    tune.run(
        ppo.PPOTrainer,
        config=config,
        stop=stop,
        local_dir=args.logdir / args.env.lower().replace(" ", "_"),
        checkpoint_freq=10,
        checkpoint_at_end=True,
    )


if __name__ == "__main__":
    args = parse_args()
    ray.init(address=None)
    train(args)
What is the problem?
I am getting the following ValueError:
File "python/ray/_raylet.pyx", line 443, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 477, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 481, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 482, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 436, in ray._raylet.execute_task.function_executor
File "/home/juhlik/.cache/pypoetry/virtualenvs/tryout-sqdI2LqI-py3.8/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 456, in __init__
self.policy_map, self.preprocessors = self._build_policy_map(
File "/home/juhlik/.cache/pypoetry/virtualenvs/tryout-sqdI2LqI-py3.8/lib/python3.8/site-packages/ray/rllib/evaluation/rollout_worker.py", line 1059, in _build_policy_map
policy_map[name] = cls(obs_space, act_space, merged_conf)
File "/home/juhlik/.cache/pypoetry/virtualenvs/tryout-sqdI2LqI-py3.8/lib/python3.8/site-packages/ray/rllib/policy/eager_tf_policy.py", line 235, in __init__
self.model = ModelCatalog.get_model_v2(
File "/home/juhlik/.cache/pypoetry/virtualenvs/tryout-sqdI2LqI-py3.8/lib/python3.8/site-packages/ray/rllib/models/catalog.py", line 346, in get_model_v2
raise ValueError(
ValueError: It looks like variables {<tf.Variable 'default_policy/Variable:0' shape=() dtype=int64, numpy=0>, <tf.Variable 'default_policy/Variable:0' shape=() dtype=int64, numpy=0>, <tf.Variable 'default_policy/Variable:0' shape=() dtype=int64, numpy=0>, <tf.Variable 'default_policy/Variable:0' shape=() dtype=int64, numpy=0>, <tf.Variable 'default_policy/Variable:0' shape=() dtype=int64, numpy=0>, <tf.Variable 'default_policy/Variable:0' shape=() dtype=int64, numpy=0>} were created as part of <__main__.MyKerasModel object at 0x7f72d221f040> but does not appear in model.variables() ({<tf.Variable 'dense/kernel:0' shape=(4, 64) dtype=float32, numpy=
array([[ 0.19838217, 0.12977728, 0.13880992, -0.02866092, -0.27592704,
...
dtype=float32)>, <tf.Variable 'dense/bias:0' shape=(64,) dtype=float32, numpy=
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)>, <tf.Variable 'my_out/bias:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)>, <tf.Variable 'value_out/kernel:0' shape=(64, 1) dtype=float32, numpy=
array([[ 0.2723226 ],
...
[-0.18534249]], dtype=float32)>, <tf.Variable 'my_out/kernel:0' shape=(64, 2) dtype=float32, numpy=
array([[ 0.13135597, 0.20843989],
...
[ 0.117394 , 0.24626827]], dtype=float32)>, <tf.Variable 'dense_1/bias:0' shape=(64,) dtype=float32, numpy=
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)>, <tf.Variable 'value_out/bias:0' shape=(1,) dtype=float32, numpy=array([0.], dtype=float32)>}). Did you forget to call model.register_variables() on the variables in question?
Note that python custom_model.py --framework tf works without any issue.
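One way to see which variables the model actually registers is to instantiate it directly, outside of Tune. A minimal sketch, assuming the CartPole-v1 spaces, eager (tf2) mode, and a hypothetical name "debug_model":

# Hypothetical inspection snippet (not part of the training script above):
# build the model by hand and print what ends up in model.variables().
import gym
env = gym.make("CartPole-v1")
model = MyKerasModel(env.observation_space, env.action_space,
                     num_outputs=2, model_config={}, name="debug_model")
for v in model.variables():
    print(v.name, v.shape, v.dtype)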
- I have verified my script runs in a clean environment and reproduces the issue.
- I have verified the issue also occurs with the latest wheels.
Issue Analytics
- Created: 3 years ago
- Comments: 13 (7 by maintainers)
Surprisingly, commenting out the following made the problem go away. I did this because the callback said "This is no longer required", but I am not entirely sure whether this is a proper solution.
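Presumably this refers to the register_variables call in MyKerasModel.__init__, i.e. something like the following sketch (an assumption based on the "This is no longer required" quote above):

        self.base_model = tf.keras.Model(self.inputs, [layer_out, value_out])
        # Commenting out this call is the workaround described above
        # (assumption: this is the line the deprecation hint refers to).
        # self.register_variables(self.base_model.variables)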
The experiment ran fine, although I did not get good results (the reason for that is unknown at the moment).
I am having the same issue with Ray 1.1.0 and TensorFlow 2.4.1.