[RLlib] Support for nested container action spaces
Gym Platform is an environment used as a benchmark in many academic papers on RL algorithms that support hybrid action spaces (discrete and continuous).
The action space generated by this environment is a nested gym Tuple, which is common in environments with hybrid action spaces:
Tuple(Discrete(3), Tuple(Box(1,), Box(1,), Box(1,)))
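For illustration, a space with this structure can be constructed directly with gym.spaces; the bounds below are placeholders and not the actual parameter ranges of Platform-v0:

import numpy as np
from gym.spaces import Box, Discrete, Tuple

# Hybrid action space: a discrete action selector plus one continuous
# parameter vector per discrete action. Bounds here are placeholders.
action_space = Tuple((
    Discrete(3),
    Tuple((
        Box(low=0.0, high=1.0, shape=(1,), dtype=np.float32),
        Box(low=0.0, high=1.0, shape=(1,), dtype=np.float32),
        Box(low=0.0, high=1.0, shape=(1,), dtype=np.float32),
    )),
))

print(action_space.sample())  # e.g. (1, (array([0.42]), array([0.08]), array([0.93])))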
When RLlib tries to initialise itself, it fails while creating the action placeholders in ray/rllib/models/catalog.py (in the call to get_action_shape), and there is no way to customise this without patching the library.
You can reproduce the behaviour with the code below:
import ray
from ray import tune
import gym
import gym_platform
from ray.tune.registry import register_env


class Platform(gym.Env):
    def __init__(self, env_config):
        self.env = gym.make("gym_platform:Platform-v0")
        self.action_space = self.env.action_space
        self.observation_space = self.env.observation_space

    def reset(self):
        return self.env.reset()

    def step(self, action):
        return self.env.step(action)


register_env("platform", lambda config: Platform(config))

ray.init()
tune.run(
    "A3C",
    stop={"training_iteration": 10},
    config={
        "env": "platform",
        "num_workers": 1,
    },
)
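Before nested support landed, one possible workaround was to expose a single-level Tuple to RLlib (non-nested Tuples already work, see the comments below) and re-nest the action inside step(). A minimal sketch of that idea, building on the Platform wrapper above; this is not part of the original issue, and the exact action format the wrapped env expects may differ:

import gym

class FlatPlatform(Platform):
    """Exposes the nested action space to RLlib as a single-level Tuple."""

    def __init__(self, env_config):
        super().__init__(env_config)
        inner = self.env.action_space
        # Flatten Tuple(Discrete(3), Tuple(Box, Box, Box))
        # into Tuple(Discrete(3), Box, Box, Box).
        self.action_space = gym.spaces.Tuple(
            (inner.spaces[0],) + tuple(inner.spaces[1].spaces)
        )

    def step(self, action):
        # Re-nest the flat action before handing it to the wrapped env.
        nested_action = (action[0], tuple(action[1:]))
        return self.env.step(nested_action)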
Issue Analytics
- State: closed
- Created 4 years ago
- Reactions: 1
- Comments: 6 (6 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Yeah this would be great to have, since non-nested ones work already. Might be good to also support Dict action spaces (which is basically the same as Tuple but with names for the indices).
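For comparison, here is a sketch of how the same hybrid action space could be expressed as a Dict; the key names and bounds are illustrative and not taken from gym_platform:

import numpy as np
from gym.spaces import Box, Dict, Discrete, Tuple

# Same structure as the nested Tuple above, but with named entries.
# Key names and bounds are placeholders, not taken from gym_platform.
action_space = Dict({
    "action_type": Discrete(3),
    "parameters": Tuple((
        Box(low=0.0, high=1.0, shape=(1,), dtype=np.float32),
        Box(low=0.0, high=1.0, shape=(1,), dtype=np.float32),
        Box(low=0.0, high=1.0, shape=(1,), dtype=np.float32),
    )),
})

print(action_space.sample())  # dict with a discrete choice and three parameter arrays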
PR (https://github.com/ray-project/ray/pull/8019) is out and will be merged in the next few days. It contains an example learning (PPO) script in rllib/examples/nested_action_spaces.py. I’m closing this issue.