
Skipping observation in multi agent env

See original GitHub issue

Describe your feature request

I am working on an implementation of the werewolf game using the RLlib wrapper for Gym multi-agent envs. In this game there are wolves and villagers.

The game is divided into a night and a day phase. During the day every agent can perform an action, while during the night only wolves can. More precisely, night observations should not be visible to villager agents. I have an observation field that specifies the current phase, and I would like to filter out night observations for the villagers. Is there an easy way to implement this?
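One way to approach this without touching RLlib internals is to rely on the behaviour of MultiAgentEnv itself: only agents whose IDs appear in the observation dict returned by step() are asked for an action on the next step. Below is a minimal sketch of that idea; the WerewolfEnv class, the self.phase attribute and the helper methods are hypothetical and only illustrate omitting villager observations during the night phase.

    from ray.rllib.env.multi_agent_env import MultiAgentEnv

    class WerewolfEnv(MultiAgentEnv):
        """Hypothetical env: everyone acts by day, only wolves act at night."""

        def step(self, action_dict):
            self._apply_actions(action_dict)  # hypothetical game logic
            obs, rew, done, info = {}, {}, {}, {}
            for agent_id in self._alive_agents():  # hypothetical helper
                # At night, leave villagers out of the returned dicts entirely:
                # agents missing from obs are simply not queried for an action.
                if self.phase == "night" and agent_id.startswith("villager"):
                    continue
                obs[agent_id] = self._observe(agent_id)  # hypothetical
                rew[agent_id] = self._reward(agent_id)   # hypothetical
                done[agent_id] = False
            done["__all__"] = self._game_over()          # hypothetical
            return obs, rew, done, info

Note that when done["__all__"] is set to True, RLlib still expects a final observation for every live agent, which is exactly what the error in Edit 1 below complains about.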

What I have tried

I tried modifying the _process_observations function by adding a line after line 403. Using a custom Preprocessor, I am able to return None if the current observation should be discarded (given an agent id). Then, if the processed observation is None, I just skip the step with:

    if prep_obs is None:
        continue

I don’t know if this implementation is conceptually correct or if there is another way to do it. Please let me know.
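For reference, the preprocessor hack described above would look roughly like the sketch below. Everything here is illustrative rather than the actual code: the class name, the assumed phase encoding, and the registration key are made up, and the approach still depends on the None check patched into _process_observations.

    import numpy as np
    from ray.rllib.models import ModelCatalog
    from ray.rllib.models.preprocessors import Preprocessor

    NIGHT_PHASE = 1  # assumed encoding of the phase flag inside the observation

    class VillagerNightFilter(Preprocessor):
        """Hypothetical preprocessor for the villager policy: drops night obs."""

        def _init_shape(self, obs_space, options):
            return obs_space.shape

        def transform(self, observation):
            # Assumes the first entry of the observation encodes the phase.
            if observation[0] == NIGHT_PHASE:
                return None  # caught by the patched _process_observations above
            return np.asarray(observation)

    ModelCatalog.register_custom_preprocessor("villager_night_filter", VillagerNightFilter)

It would then be attached to the villager policy through the model config's custom_preprocessor option (in RLlib versions of that era), but as Edit 1 just below shows, skipping steps this way interacts badly with episode termination.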

Edit 1

Applying the previous method yields:

    ValueError: The environment terminated for all agents, but we still don't have a last observation for agent villager_2 (policy vill_p). Please ensure that you include the last observations of all live agents when setting '__all__' done to True. Alternatively, set no_done_at_end=True to allow this.

The error is raised here.
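As the error message itself suggests, one workaround is the no_done_at_end flag, which in RLlib versions of that era was a top-level trainer config option. A sketch, assuming the same configs dict used in the comments below:

    configs = {
        # ... env, multiagent and policy settings as before ...
        "no_done_at_end": True,  # as suggested by the error message above
    }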

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 7 (3 by maintainers)

Top GitHub Comments

1 reaction
nicofirst1 commented, Jan 11, 2020

Moreover, the second solution seems to work for this issue, so we could consider it closed.

0 reactions
nicofirst1 commented, Jan 11, 2020

Sorry for the late reply. I managed to solve the problem by running:

    analysis = tune.run(
        "PG",
        local_dir=Params.RAY_DIR,
        config=configs,
        trial_name_creator=trial_name_creator,
    )

rather than:

    trainer = PGTrainer(configs, PolicyWw)
    for i in tqdm(range(20)):
        trainer.train()

