
DQN does not allow custom models

See original GitHub issue

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
  • Ray installed from (source or binary): source
  • Ray version: 0.8.0.dev6
  • Python version: 3.7.5
  • Exact command to reproduce:

The following code tries to set the built-in TF VisionNetwork as the custom model, and it errors out as described below. However, the code succeeds if no custom model is set, in which case the exact same VisionNetwork gets selected automatically by _get_v2_model. The cause of the issue is explained below, but I'm not sure about the fix.

import ray
from ray.rllib.agents.dqn import DQNTrainer
from ray.rllib.models import ModelCatalog
from ray.rllib.models.tf.visionnet_v2 import VisionNetwork

ModelCatalog.register_custom_model("my_model", VisionNetwork)

config = {'model': {
            "custom_model": "my_model",
            "custom_options": {},  # extra options to pass to your model
        }}
ray.init()

agent = DQNTrainer(config=config, env="BreakoutNoFrameskip-v4")
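For reference, the same trainer builds fine when "custom_model" is left unset, since _get_v2_model then picks this exact VisionNetwork automatically. A minimal sketch of that working baseline:

import ray
from ray.rllib.agents.dqn import DQNTrainer

ray.init()
# No "custom_model" in the config -- RLlib chooses the default vision net
# for the image observation space on its own.
agent = DQNTrainer(config={"model": {}}, env="BreakoutNoFrameskip-v4")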

Describe the problem

The current code in master does not allow the use of custom models in DQN. When trying to use a custom model (for either TF or PyTorch), an error is thrown indicating that the model is not subclassed from DistributionalQModel. This happens even when the custom model is simply set to ray.rllib.models.tf.visionnet_v2.VisionNetwork.

Error message:

'The given model must subclass', <class 'ray.rllib.agents.dqn.distributional_q_model.DistributionalQModel'>)

Source code / logs

The cause of this issue is this check. Notice that the check is only done if custom_model is set. Apparently the built-in models don't subclass DistributionalQModel either; however, since the check is not applied to built-in models, they work fine.
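Roughly, the custom-model branch in ModelCatalog.get_model_v2 does something like the following (a paraphrase based on the error text above, not the literal source; DQN passes model_interface=DistributionalQModel when building its policy model):

# Hedged paraphrase of the offending check, not the exact RLlib code:
if model_interface and not issubclass(custom_model_cls, model_interface):
    # Any custom model that only subclasses TFModelV2 fails here for DQN.
    raise ValueError("The given model must subclass", model_interface)

The default-model path never runs this check, which is why the built-in VisionNetwork works when picked automatically but fails when registered as a custom model.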

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 6 (4 by maintainers)

Top GitHub Comments

1 reaction
arunavo4 commented, Nov 9, 2019

@sytelus Hey, I ran into this exact issue a few days back; all I did was subclass the right model, and everything works as expected. Copy-paste the model code below.

# Imports are assumed for the Ray 0.8.x-era RLlib layout used in this issue;
# module paths may differ in other releases.
from ray.rllib.agents.dqn.distributional_q_model import DistributionalQModel
from ray.rllib.models import ModelCatalog
from ray.rllib.models.tf.misc import get_activation_fn, normc_initializer
from ray.rllib.models.tf.visionnet_v1 import _get_filter_config
from ray.rllib.utils import try_import_tf

tf = try_import_tf()


# ============== VisionNetwork Model ==================

class VisionNetwork(DistributionalQModel):
    """Generic vision network implemented in DistributionalQModel API."""

    def __init__(self, obs_space, action_space, num_outputs, model_config,
                 name, **kw):
        super(VisionNetwork, self).__init__(
            obs_space, action_space, num_outputs, model_config, name, **kw)

        activation = get_activation_fn(model_config.get("conv_activation"))
        filters = model_config.get("conv_filters")
        if not filters:
            filters = _get_filter_config(obs_space.shape)
        no_final_linear = model_config.get("no_final_linear")
        vf_share_layers = model_config.get("vf_share_layers")

        inputs = tf.keras.layers.Input(
            shape=obs_space.shape, name="observations")
        last_layer = inputs

        # Build the action layers
        for i, (out_size, kernel, stride) in enumerate(filters[:-1], 1):
            last_layer = tf.keras.layers.Conv2D(
                out_size,
                kernel,
                strides=(stride, stride),
                activation=activation,
                padding="same",
                name="conv{}".format(i))(last_layer)
        out_size, kernel, stride = filters[-1]
        if no_final_linear:
            # the last layer is adjusted to be of size num_outputs
            last_layer = tf.keras.layers.Conv2D(
                num_outputs,
                kernel,
                strides=(stride, stride),
                activation=activation,
                padding="valid",
                name="conv_out")(last_layer)
            conv_out = last_layer
        else:
            last_layer = tf.keras.layers.Conv2D(
                out_size,
                kernel,
                strides=(stride, stride),
                activation=activation,
                padding="valid",
                name="conv{}".format(i + 1))(last_layer)
            conv_out = tf.keras.layers.Conv2D(
                num_outputs, [1, 1],
                activation=None,
                padding="same",
                name="conv_out")(last_layer)

        # Build the value layers
        if vf_share_layers:
            last_layer = tf.keras.layers.Lambda(
                lambda x: tf.squeeze(x, axis=[1, 2]))(last_layer)
            value_out = tf.keras.layers.Dense(
                1,
                name="value_out",
                activation=None,
                kernel_initializer=normc_initializer(0.01))(last_layer)
        else:
            # build a parallel set of hidden layers for the value net
            last_layer = inputs
            for i, (out_size, kernel, stride) in enumerate(filters[:-1], 1):
                last_layer = tf.keras.layers.Conv2D(
                    out_size,
                    kernel,
                    strides=(stride, stride),
                    activation=activation,
                    padding="same",
                    name="conv_value_{}".format(i))(last_layer)
            out_size, kernel, stride = filters[-1]
            last_layer = tf.keras.layers.Conv2D(
                out_size,
                kernel,
                strides=(stride, stride),
                activation=activation,
                padding="valid",
                name="conv_value_{}".format(i + 1))(last_layer)
            last_layer = tf.keras.layers.Conv2D(
                1, [1, 1],
                activation=None,
                padding="same",
                name="conv_value_out")(last_layer)
            value_out = tf.keras.layers.Lambda(
                lambda x: tf.squeeze(x, axis=[1, 2]))(last_layer)

        self.base_model = tf.keras.Model(inputs, [conv_out, value_out])
        self.register_variables(self.base_model.variables)

    def forward(self, input_dict, state, seq_lens):
        # explicit cast to float32 needed in eager
        model_out, self._value_out = self.base_model(
            tf.cast(input_dict["obs"], tf.float32))
        return tf.squeeze(model_out, axis=[1, 2]), state

    def value_function(self):
        return tf.reshape(self._value_out, [-1])


# ================== Register Custom Model ======================
ModelCatalog.register_custom_model("NatureCNN", VisionNetwork)`

0 reactions
AmeerHajAli commented, Nov 9, 2019

That sounds good!

