[Question] [rllib] Not all agents may be present in the dict in each time step
See original GitHub issue.

What is your question?
I am attempting to create a stochastic game environment where the agents take turns. Each agent’s turn modifies the state that the other agent would see. The multiagent docs seem to indicate that some agents may not be present in each time step, and that only those agents who are present will supply an action. I’m trying to leverage this in my step function, something like this:
if "player_1" in action_dict:
    new_state = process_player1_action(action_dict["player_1"])
    obs = {"player_2": new_state}
if "player_2" in action_dict:
    new_state = process_player2_action(action_dict["player_2"])
    obs = {"player_1": new_state}
When I do this, I get the following error messages:
Traceback (most recent call last):
File "estimation_sg.py", line 178, in <module>
result = trainer.train()
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 443, in train
raise e
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/agents/trainer.py", line 432, in train
result = Trainable.train(self)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/tune/trainable.py", line 254, in train
result = self._train()
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/agents/trainer_template.py", line 125, in _train
fetches = self.optimizer.step()
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/optimizers/multi_gpu_optimizer.py", line 136, in step
self.num_envs_per_worker, self.train_batch_size)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/optimizers/rollout.py", line 25, in collect_samples
next_sample = ray_get_and_free(fut_sample)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/utils/memory.py", line 29, in ray_get_and_free
result = ray.get(object_ids)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/worker.py", line 1492, in get
raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ValueError): ray_worker (pid=6328, ip=10.247.242.141)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/gym/spaces/box.py", line 104, in contains
return x.shape == self.shape and np.all(x >= self.low) and np.all(x <= self.high)
AttributeError: 'NoneType' object has no attribute 'shape'
During handling of the above exception, another exception occurred:
ray_worker (pid=6328, ip=10.247.242.141)
File "python/ray/_raylet.pyx", line 637, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 638, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 643, in ray._raylet.execute_task
File "python/ray/_raylet.pyx", line 623, in function_executor
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/rollout_worker.py", line 475, in sample
batches = [self.input_reader.next()]
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 52, in next
batches = [self.get_data()]
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 95, in get_data
item = next(self.rollout_provider)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 315, in _env_runner
soft_horizon, no_done_at_end)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/evaluation/sampler.py", line 403, in _process_observations
policy_id).transform(raw_obs)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/models/preprocessors.py", line 162, in transform
self.check_shape(observation)
File "/Users/rusu1/.python-virtual-env/rllib/lib/python3.7/site-packages/ray/rllib/models/preprocessors.py", line 65, in check_shape
"should be an np.array, not a Python list.", observation)
ValueError: ('Observation for a Box/MultiBinary/MultiDiscrete space should be an np.array, not a Python list.', None)
Am I misunderstanding what the docs seem to indicate about agents being present in each obs?
Python 3.7, TensorFlow 2.1, Ray 0.8.1, macOS 10.14
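For context, the turn-taking pattern described above could look like the following minimal MultiAgentEnv sketch (the class name, agent ids, and toy dynamics are illustrative assumptions, not code from the issue): each step() returns obs only for the agent that acts next, and reset() returns an np.array observation for whichever agent moves first.

import numpy as np
from gym import spaces
from ray.rllib.env.multi_agent_env import MultiAgentEnv


class TurnBasedEnv(MultiAgentEnv):
    """Two players alternate; only the player who acts next appears in obs."""

    def __init__(self, config=None):
        self.observation_space = spaces.Box(-1.0, 1.0, shape=(4,), dtype=np.float32)
        self.action_space = spaces.Discrete(2)
        self.state = np.zeros(4, dtype=np.float32)
        self.num_moves = 0

    def reset(self):
        self.state = np.zeros(4, dtype=np.float32)
        self.num_moves = 0
        # reset() must return the initial obs dict; only the first mover is present.
        return {"player_1": self.state}

    def step(self, action_dict):
        self.num_moves += 1
        if "player_1" in action_dict:
            self.state = self._apply(action_dict["player_1"])
            acted, next_up = "player_1", "player_2"
        else:
            self.state = self._apply(action_dict["player_2"])
            acted, next_up = "player_2", "player_1"

        done = self.num_moves >= 20
        if done:
            # At episode end, give every agent a final obs/reward so RLlib can
            # close out both trajectories.
            obs = {"player_1": self.state, "player_2": self.state}
            rew = {"player_1": 0.0, "player_2": 0.0}
        else:
            # Otherwise: obs only for the agent that acts next, reward for the
            # agent that just acted.
            obs = {next_up: self.state}
            rew = {acted: 0.0}
        return obs, rew, {"__all__": done}, {}

    def _apply(self, action):
        # Placeholder dynamics: nudge the shared state based on the discrete action.
        delta = 0.1 if action == 1 else -0.1
        return np.clip(self.state + delta, -1.0, 1.0).astype(np.float32)

Every value in the returned obs dict must be a real np.array inside the declared space; returning None or a plain Python list for a present agent id is exactly what the preprocessor error above complains about.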
Top GitHub Comments
I ran into this issue when using Ray Tune. The problem was that I was not returning anything from my custom gym.Env reset() function; it needs to return the initial observation.
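In the multi-agent case, that means reset() should return a dict keyed by whichever agent(s) act first, e.g. (a minimal sketch of the reset() method of a MultiAgentEnv subclass; the agent id and state shape are assumptions):

def reset(self):
    self.state = np.zeros(4, dtype=np.float32)
    # Forgetting this return is what later surfaces as the
    # "'NoneType' object has no attribute 'shape'" / "should be an np.array" errors.
    return {"player_1": self.state}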
I’m sorry @caesar025, I don’t remember what I did to fix this. I can say, though, that in this case the gym error is misleading; the true issue was the NoneType error.
I have been working extensively with MultiAgentEnv since then and have gained some good experience with what to expect when connecting environments to RLlib. If you want to share your code, I can take a look and help debug a bit.
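For anyone landing here, a minimal way to wire a turn-based env like the one sketched above into training on the ray 0.8.x API might look like this (a sketch, not the commenter's code; the env name, policy ids, and spaces are assumptions carried over from the earlier sketch):

import numpy as np
from gym import spaces
import ray
from ray import tune
from ray.tune.registry import register_env

# Spaces must match what the env actually returns.
obs_space = spaces.Box(-1.0, 1.0, shape=(4,), dtype=np.float32)
act_space = spaces.Discrete(2)

register_env("turn_based", lambda cfg: TurnBasedEnv(cfg))

ray.init()
tune.run(
    "PPO",
    stop={"training_iteration": 10},
    config={
        "env": "turn_based",
        "multiagent": {
            # policy_id -> (policy_cls, obs_space, act_space, config); None uses
            # the algorithm's default policy class.
            "policies": {
                "player_1": (None, obs_space, act_space, {}),
                "player_2": (None, obs_space, act_space, {}),
            },
            # Route each agent id to the policy of the same name.
            "policy_mapping_fn": lambda agent_id: agent_id,
        },
    },
)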