
DQN does not allow custom models

See original GitHub issue

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
  • Ray installed from (source or binary): source
  • Ray version: 0.8.0.dev6
  • Python version: 3.7.5
  • Exact command to reproduce:

The following code tries to set the built-in TF VisionNetwork as the custom model, and it errors out as described below. However, the code succeeds if no custom model is set, in which case the exact same VisionNetwork gets selected automatically by _get_v2_model. The cause of the issue is explained below, but I'm not sure about the fix.

import ray
from ray.rllib.agents.dqn import DQNTrainer
from ray.rllib.models import ModelCatalog
from ray.rllib.models.tf.visionnet_v2 import VisionNetwork

ModelCatalog.register_custom_model("my_model", VisionNetwork)

config = {'model': {
            "custom_model": "my_model",
            "custom_options": {},  # extra options to pass to your model
        }}
ray.init()

agent = DQNTrainer(config=config, env="BreakoutNoFrameskip-v4")
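For reference, the same trainer builds fine when "custom_model" is left unset, since _get_v2_model then picks this exact VisionNetwork automatically. A minimal sketch of that working baseline:

import ray
from ray.rllib.agents.dqn import DQNTrainer

ray.init()
# No "custom_model" in the config -- RLlib chooses the default vision net
# for the image observation space on its own.
agent = DQNTrainer(config={"model": {}}, env="BreakoutNoFrameskip-v4")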

Describe the problem

The current code in master does not allow the use of custom models in DQN. When trying to use a custom model (for either TF or PyTorch), an error is thrown indicating that the model is not subclassed from DistributionalQModel. This happens even when the custom model is simply set to ray.rllib.models.tf.visionnet_v2.VisionNetwork.

Error message:

'The given model must subclass', <class 'ray.rllib.agents.dqn.distributional_q_model.DistributionalQModel'>)

Source code / logs

The cause of this issue is this check. Notice that the check is only done if custom_model is set. Apparently the built-in models don't subclass DistributionalQModel either; however, since the check is not applied to built-in models, they work fine.
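Roughly, the custom-model branch in ModelCatalog.get_model_v2 does something like the following (a paraphrase based on the error text above, not the literal source; DQN passes model_interface=DistributionalQModel when building its policy model):

# Hedged paraphrase of the offending check, not the exact RLlib code:
if model_interface and not issubclass(custom_model_cls, model_interface):
    # Any custom model that only subclasses TFModelV2 fails here for DQN.
    raise ValueError("The given model must subclass", model_interface)

The default-model path never runs this check, which is why the built-in VisionNetwork works when picked automatically but fails when registered as a custom model.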

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
arunavo4 commented, Nov 9, 2019

@sytelus Hey, I ran into this exact issue a few days back; all I did was subclass the right model, and everything works as expected. Copy-paste the model code below.

# Imports are assumed for the Ray 0.8.x-era RLlib layout used in this issue;
# module paths may differ in other releases.
from ray.rllib.agents.dqn.distributional_q_model import DistributionalQModel
from ray.rllib.models import ModelCatalog
from ray.rllib.models.tf.misc import get_activation_fn, normc_initializer
from ray.rllib.models.tf.visionnet_v1 import _get_filter_config
from ray.rllib.utils import try_import_tf

tf = try_import_tf()


# ============== VisionNetwork Model ==================

class VisionNetwork(DistributionalQModel):
    """Generic vision network implemented in DistributionalQModel API."""

    def __init__(self, obs_space, action_space, num_outputs, model_config,
                 name, **kw):
        super(VisionNetwork, self).__init__(
            obs_space, action_space, num_outputs, model_config, name, **kw)

        activation = get_activation_fn(model_config.get("conv_activation"))
        filters = model_config.get("conv_filters")
        if not filters:
            filters = _get_filter_config(obs_space.shape)
        no_final_linear = model_config.get("no_final_linear")
        vf_share_layers = model_config.get("vf_share_layers")

        inputs = tf.keras.layers.Input(
            shape=obs_space.shape, name="observations")
        last_layer = inputs

        # Build the action layers
        for i, (out_size, kernel, stride) in enumerate(filters[:-1], 1):
            last_layer = tf.keras.layers.Conv2D(
                out_size,
                kernel,
                strides=(stride, stride),
                activation=activation,
                padding="same",
                name="conv{}".format(i))(last_layer)
        out_size, kernel, stride = filters[-1]
        if no_final_linear:
            # the last layer is adjusted to be of size num_outputs
            last_layer = tf.keras.layers.Conv2D(
                num_outputs,
                kernel,
                strides=(stride, stride),
                activation=activation,
                padding="valid",
                name="conv_out")(last_layer)
            conv_out = last_layer
        else:
            last_layer = tf.keras.layers.Conv2D(
                out_size,
                kernel,
                strides=(stride, stride),
                activation=activation,
                padding="valid",
                name="conv{}".format(i + 1))(last_layer)
            conv_out = tf.keras.layers.Conv2D(
                num_outputs, [1, 1],
                activation=None,
                padding="same",
                name="conv_out")(last_layer)

        # Build the value layers
        if vf_share_layers:
            last_layer = tf.keras.layers.Lambda(
                lambda x: tf.squeeze(x, axis=[1, 2]))(last_layer)
            value_out = tf.keras.layers.Dense(
                1,
                name="value_out",
                activation=None,
                kernel_initializer=normc_initializer(0.01))(last_layer)
        else:
            # build a parallel set of hidden layers for the value net
            last_layer = inputs
            for i, (out_size, kernel, stride) in enumerate(filters[:-1], 1):
                last_layer = tf.keras.layers.Conv2D(
                    out_size,
                    kernel,
                    strides=(stride, stride),
                    activation=activation,
                    padding="same",
                    name="conv_value_{}".format(i))(last_layer)
            out_size, kernel, stride = filters[-1]
            last_layer = tf.keras.layers.Conv2D(
                out_size,
                kernel,
                strides=(stride, stride),
                activation=activation,
                padding="valid",
                name="conv_value_{}".format(i + 1))(last_layer)
            last_layer = tf.keras.layers.Conv2D(
                1, [1, 1],
                activation=None,
                padding="same",
                name="conv_value_out")(last_layer)
            value_out = tf.keras.layers.Lambda(
                lambda x: tf.squeeze(x, axis=[1, 2]))(last_layer)

        self.base_model = tf.keras.Model(inputs, [conv_out, value_out])
        self.register_variables(self.base_model.variables)

    def forward(self, input_dict, state, seq_lens):
        # explicit cast to float32 needed in eager
        model_out, self._value_out = self.base_model(
            tf.cast(input_dict["obs"], tf.float32))
        return tf.squeeze(model_out, axis=[1, 2]), state

    def value_function(self):
        return tf.reshape(self._value_out, [-1])


# ================== Register Custom Model ======================
ModelCatalog.register_custom_model("NatureCNN", VisionNetwork)`

0 reactions
AmeerHajAli commented, Nov 9, 2019

That sounds good!

