
[RLlib] Using Dict state space throws exception (not supported yet?)

See original GitHub issue

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux 16.04
  • Ray installed from (source or binary): Source
  • Ray version: 0.5.3
  • Python version: 3.6.6
  • Exact command to reproduce:

Describe the problem

I am trying to use a Dict state (observation) space with PPO, but it throws an exception. More details below.

Source code / logs

# Imports and a few constants (SERVER_ADDRESS, SERVER_PORT, img_height,
# img_width, register_my_model() and the "mymodel" custom model) are defined
# elsewhere in the full 142-line script; the import paths below assume the
# Ray 0.5.x module layout used by the serving example.
import numpy as np
from gym import spaces

import ray
from ray.rllib.agents.ppo import PPOAgent
from ray.rllib.env.serving_env import ServingEnv
from ray.rllib.utils.policy_server import PolicyServer
from ray.tune.registry import register_env


class MyServing(ServingEnv):
    def __init__(self):
        # Continuous scalar action space; Dict observation space made up of
        # an image and a scalar speed.
        ServingEnv.__init__(
            self, spaces.Box(-1.0, 1.0, (1,), dtype=np.float32),
            spaces.Dict({
                "image": spaces.Box(0.0, 1.0, (img_height, img_width, 3),
                                    dtype=np.float32),
                "speed": spaces.Box(0.0, 1.0, (1,), dtype=np.float32),
            }))

    def run(self):
        print("Starting policy server at {}:{}".format(SERVER_ADDRESS,
                                                       SERVER_PORT))
        server = PolicyServer(self, SERVER_ADDRESS, SERVER_PORT)
        server.serve_forever()


if __name__ == "__main__":
    register_my_model()
    ray.init(num_gpus=1)
    register_env("srv", lambda _: MyServing())

    # Adapted from the DQN serving example (DQN supports off-policy actions),
    # but configured with PPO here; any agent can be used.
    ppo = PPOAgent(
        env="srv",
        config={
            # Use a single process to avoid needing to set up a load balancer
            "num_workers": 0,
            "num_gpus": 1,
            "batch_mode": "complete_episodes",
            "train_batch_size": 2000,
            "model": {
                "custom_model": "mymodel"
            },
            # Options for short debugging iterations, carried over from the
            # DQN example and left disabled:
            # "exploration_fraction": 0.01,
            # "learning_starts": 100,
            # "timesteps_per_iteration": 200,
            # "schedule_max_timesteps": 100000,
            # "gamma": 0.8,
            "tf_session_args": {
                "gpu_options": {"allow_growth": True},
            },
        })

  File "/RL/ray-master/ray/python/ray/rllib/my_scripts/ppo/udacity_server_ppo.py", line 142, in <module>
    "gpu_options": {"allow_growth": True},
  File "/RL/ray-master/ray/python/ray/rllib/agents/agent.py", line 216, in __init__
    Trainable.__init__(self, config, logger_creator)
  File "/RL/ray-master/ray/python/ray/tune/trainable.py", line 86, in __init__
    self._setup()
  File "/RL/ray-master/ray/python/ray/rllib/agents/agent.py", line 258, in _setup
    self._init()
  File "/RL/ray-master/ray/python/ray/rllib/agents/ppo/ppo.py", line 85, in _init
    self.env_creator, self._policy_graph)
  File "/RL/ray-master/ray/python/ray/rllib/agents/agent.py", line 131, in make_local_evaluator
    "inter_op_parallelism_threads": None,
  File "/RL/ray-master/ray/python/ray/rllib/agents/agent.py", line 171, in _make_evaluator
    monitor_path=self.logdir if config["monitor"] else None)
  File "/RL/ray-master/ray/python/ray/rllib/evaluation/policy_evaluator.py", line 228, in __init__
    policy_dict, policy_config)
  File "/RL/ray-master/ray/python/ray/rllib/evaluation/policy_evaluator.py", line 286, in _build_policy_map
    policy_map[name] = cls(obs_space, act_space, merged_conf)
  File "/RL/ray-master/ray/python/ray/rllib/agents/ppo/ppo_policy_graph.py", line 123, in __init__
    shape=(None, ) + observation_space.shape)
TypeError: can only concatenate tuple (not "NoneType") to tuple
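For context on the TypeError above: the PPO policy graph builds its observation placeholder as (None,) + observation_space.shape, which assumes a flat Box-style space. A composite Dict space exposes no single shape (its .shape is None), so the concatenation fails. A minimal sketch with plain gym spaces (no RLlib needed) that reproduces the failing expression:

import numpy as np
from gym import spaces

box = spaces.Box(0.0, 1.0, (84, 84, 3), dtype=np.float32)
dict_space = spaces.Dict({
    "image": box,
    "speed": spaces.Box(0.0, 1.0, (1,), dtype=np.float32),
})

print((None,) + box.shape)    # (None, 84, 84, 3): the batched placeholder shape
print(dict_space.shape)       # None: a Dict space has no single flat shape
(None,) + dict_space.shape    # raises TypeError: can only concatenate tuple (not "NoneType") to tuple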

Issue Analytics

  • State: closed
  • Created: 5 years ago
  • Reactions: 1
  • Comments: 15 (7 by maintainers)

Top GitHub Comments

1 reaction
s-udhaya commented, Oct 15, 2018

@ericl I can confirm that the dict state space works now. Thanks for the fix.

0 reactions
colllin commented, Aug 28, 2019

Sorry for the false alarm 🤦‍♂. I commented out my model, and the env does seem to run with Ray’s default model. It’s probably something I’m doing (or not doing).
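For anyone landing here later: once Dict observation spaces are supported, RLlib flattens them with a built-in preprocessor, so PPO can train on them with the default model. Below is a minimal, hedged sketch of a toy Dict-observation env (the DictObsEnv name and dummy reward are made up for illustration; the tune.run("PPO", ...) call and config keys follow the classic pre-2.x trainer API and may need adjusting for your Ray version):

import gym
import numpy as np
from gym import spaces

import ray
from ray import tune
from ray.tune.registry import register_env


class DictObsEnv(gym.Env):
    # Toy env with a Dict observation space (image + speed), mirroring the issue.

    def __init__(self, env_config=None):
        self.action_space = spaces.Box(-1.0, 1.0, (1,), dtype=np.float32)
        self.observation_space = spaces.Dict({
            "image": spaces.Box(0.0, 1.0, (32, 32, 3), dtype=np.float32),
            "speed": spaces.Box(0.0, 1.0, (1,), dtype=np.float32),
        })
        self._steps = 0

    def reset(self):
        self._steps = 0
        return self.observation_space.sample()

    def step(self, action):
        self._steps += 1
        obs = self.observation_space.sample()
        reward = float(action[0])   # dummy reward, just to exercise training
        done = self._steps >= 10
        return obs, reward, done, {}


if __name__ == "__main__":
    ray.init()
    register_env("dict_obs_env", lambda cfg: DictObsEnv(cfg))
    # RLlib flattens the Dict observations internally, so the default model
    # can consume them; run a single short iteration as a smoke test.
    tune.run(
        "PPO",
        stop={"training_iteration": 1},
        config={"env": "dict_obs_env", "num_workers": 0},
    )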


