[Question] [rllib] Not all agents may be present in the dict in each time step
See original GitHub issue.

What is your question?
I am attempting to create a stochastic game environment where the agents take turns. Each agent’s turn modifies the state that the other agent would see. The multiagent docs seem to indicate that some agents may not be present in each time step, and that only those agents who are present will supply an action. I’m trying to leverage this in my step function, something like this:
if "player_1" in action_dict:
    new_state = process_player1_action(action_dict["player_1"])
    obs = {"player_2": new_state}
if "player_2" in action_dict:
    new_state = process_player2_action(action_dict["player_2"])
    obs = {"player_1": new_state}
When I do this, I get the following error messages:
Traceback (most recent call last):
File "estimation_sg.py", line 178, in <module>
result = trainer.train()
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 443, in train
raise e
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 432, in train
result = Trainable.train(self)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/tune/trainable.py", line 254, in train
result = self._train()
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 125, in _train
fetches = self.optimizer.step()
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/optimizers/multi_gpu_optimizer.py", line 136, in step
self.num_envs_per_worker, self.train_batch_size)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/optimizers/rollout.py", line 25, in collect_samples
next_sample = ray_get_and_free(fut_sample)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/utils/memory.py", line 29, in ray_get_and_free
result = ray.get(object_ids)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/worker.py", line 1492, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ValueError): ray_worker (pid=6328, ip=10.247.242.141)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/gym/spaces/box.py", line 104, in contains
return x.shape == self.shape and np.all(x >= self.low) and np.all(x <= self.high)
AttributeError: 'NoneType' object has no attribute 'shape'
During handling of the above exception, another exception occurred:
ray_worker (pid=6328, ip=10.247.242.141)
File "python/ray/_raylet.pyx", line 637, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 638, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 643, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 623, in function_executor
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 475, in sample
batches = [self.input_reader.next()]
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 52, in next
batches = [self.get_data()]
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 95, in get_data
item = next(self.rollout_provider)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 315, in _env_runner
soft_horizon, no_done_at_end)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 403, in _process_observations
policy_id).transform(raw_obs)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/models/preprocessors.py", line 162, in transform
self.check_shape(observation)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/models/preprocessors.py", line 65, in check_shape
"should be an np.array, not a Python list.", observation)
ValueError: ('Observation for a Box/MultiBinary/MultiDiscrete space should be an np.array, not a Python list.', None)
Am I misunderstanding what the docs seem to indicate about agents being present in each obs?
Python 3.7, TensorFlow 2.1, Ray 0.8.1, macOS 10.14
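For context, the turn-taking pattern described above could look like the following minimal MultiAgentEnv sketch (the class name, agent ids, and toy dynamics are illustrative assumptions, not code from the issue): each step() returns obs only for the agent that acts next, and reset() returns an np.array observation for whichever agent moves first.

import numpy as np
from gym import spaces
from ray.rllib.env.multi_agent_env import MultiAgentEnv


class TurnBasedEnv(MultiAgentEnv):
    """Two players alternate; only the player who acts next appears in obs."""

    def __init__(self, config=None):
        self.observation_space = spaces.Box(-1.0, 1.0, shape=(4,), dtype=np.float32)
        self.action_space = spaces.Discrete(2)
        self.state = np.zeros(4, dtype=np.float32)
        self.num_moves = 0

    def reset(self):
        self.state = np.zeros(4, dtype=np.float32)
        self.num_moves = 0
        # reset() must return the initial obs dict; only the first mover is present.
        return {"player_1": self.state}

    def step(self, action_dict):
        self.num_moves += 1
        if "player_1" in action_dict:
            self.state = self._apply(action_dict["player_1"])
            acted, next_up = "player_1", "player_2"
        else:
            self.state = self._apply(action_dict["player_2"])
            acted, next_up = "player_2", "player_1"

        done = self.num_moves >= 20
        if done:
            # At episode end, give every agent a final obs/reward so RLlib can
            # close out both trajectories.
            obs = {"player_1": self.state, "player_2": self.state}
            rew = {"player_1": 0.0, "player_2": 0.0}
        else:
            # Otherwise: obs only for the agent that acts next, reward for the
            # agent that just acted.
            obs = {next_up: self.state}
            rew = {acted: 0.0}
        return obs, rew, {"__all__": done}, {}

    def _apply(self, action):
        # Placeholder dynamics: nudge the shared state based on the discrete action.
        delta = 0.1 if action == 1 else -0.1
        return np.clip(self.state + delta, -1.0, 1.0).astype(np.float32)

Every value in the returned obs dict must be a real np.array inside the declared space; returning None or a plain Python list for a present agent id is exactly what the preprocessor error above complains about.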
Top GitHub Comments
I ran into this issue when using Ray Tune. The problem was that I was not returning anything from my custom gym.Env reset() function; it needs to return the initial observation.
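In the multi-agent case, that means reset() should return a dict keyed by whichever agent(s) act first, e.g. (a minimal sketch of the reset() method of a MultiAgentEnv subclass; the agent id and state shape are assumptions):

def reset(self):
    self.state = np.zeros(4, dtype=np.float32)
    # Forgetting this return is what later surfaces as the
    # "'NoneType' object has no attribute 'shape'" / "should be an np.array" errors.
    return {"player_1": self.state}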
I’m sorry @caesar025, I don’t remember what I did to fix this. I can say, though, that in this case the gym error is misleading; the true issue was the NoneType error.
I have been working extensively with MultiAgentEnv since then and have gained some good experience with what to expect when connecting environments to RLlib. If you want to share your code, I can take a look and help debug a bit.
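For anyone landing here, a minimal way to wire a turn-based env like the one sketched above into training on the ray 0.8.x API might look like this (a sketch, not the commenter's code; the env name, policy ids, and spaces are assumptions carried over from the earlier sketch):

import numpy as np
from gym import spaces
import ray
from ray import tune
from ray.tune.registry import register_env

# Spaces must match what the env actually returns.
obs_space = spaces.Box(-1.0, 1.0, shape=(4,), dtype=np.float32)
act_space = spaces.Discrete(2)

register_env("turn_based", lambda cfg: TurnBasedEnv(cfg))

ray.init()
tune.run(
    "PPO",
    stop={"training_iteration": 10},
    config={
        "env": "turn_based",
        "multiagent": {
            # policy_id -> (policy_cls, obs_space, act_space, config); None uses
            # the algorithm's default policy class.
            "policies": {
                "player_1": (None, obs_space, act_space, {}),
                "player_2": (None, obs_space, act_space, {}),
            },
            # Route each agent id to the policy of the same name.
            "policy_mapping_fn": lambda agent_id: agent_id,
        },
    },
)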