Action Discrete(5) and reward in "simple_tag" env

See original GitHub issue
  1. The action space for each agent is Discrete(5). However, in practice the action is handled as a Box(5) within (-1, 1). The code here agent.action.u[0] += action[0][1] - action[0][2] and agent.action.u[1] += action[0][3] - action[0][4] is used to get p_force and then p_vel, so what does action[0][0] do?

  2. The reward of the adversary agents at each step is based on is_collision, which turns out to be the same reward for every adversary agent, even if we consider the shaping penalty in the case shape = True. How is this different from self.shared_reward = True in environment.py?

I don’t mean to complain, I’m just wondering how it works. I’d appreciate it if you could answer me.
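
To make question 1 concrete, here is a minimal sketch of how the 5-slot action appears to be decomposed, based only on the force-update lines quoted above; treating index 0 as a no-op slot is an assumption, not something confirmed in this thread.

import numpy as np

# Hypothetical decomposition of one agent's 5-slot action vector,
# mirroring the force update quoted in question 1.
# Assumed layout: [no-op, +x, -x, +y, -y]
def decompose_action(action_vec):
    u = np.zeros(2)
    u[0] = action_vec[1] - action_vec[2]  # net force along x
    u[1] = action_vec[3] - action_vec[4]  # net force along y
    return u

# The one-hot vector selecting slot 1 produces a force of (1, 0),
# while selecting slot 0 produces no force at all.
print(decompose_action(np.array([0, 1, 0, 0, 0])))  # [1. 0.]
print(decompose_action(np.array([1, 0, 0, 0, 0])))  # [0. 0.]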

Issue Analytics

  • State: closed
  • Created 6 years ago
  • Reactions: 1
  • Comments: 5 (1 by maintainers)

Top GitHub Comments

7 reactions
tebba-von-mathenstein commented, Jan 15, 2018

@NorthernWolf, I’m not a maintainer/author, but I was playing around with it this morning and I think I have a simple example you can use to give all agents in the environment a random action in any of these environments. Just replace make_env('simple_push') with the name of the scenario you want to watch:

from make_env import make_env
import numpy as np

env = make_env('simple_push')

for i_episode in range(20):
    observation = env.reset()
    for t in range(100):
        env.render()
        agent_actions = []
        for i, agent in enumerate(env.world.agents):
            # This is a Discrete
            # https://github.com/openai/gym/blob/master/gym/spaces/discrete.py
            agent_action_space = env.action_space[i]

            # Sample returns an int from 0 to agent_action_space.n - 1
            action = agent_action_space.sample()

            # Environment expects a vector with length == agent_action_space.n
            # containing 0 or 1 for each action, 1 meaning take this action
            action_vec = np.zeros(agent_action_space.n)
            action_vec[action] = 1
            agent_actions.append(action_vec)

        # Each of these is a vector parallel to env.world.agents, as is agent_actions
        observation, reward, done, info = env.step(agent_actions)
        print(observation)
        print(reward)
        print(done)
        print(info)
        print()

Hope it helps!
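
A side note that touches question 2: the reward printed by the example above is a per-agent list. The sketch below illustrates the distinction the question asks about, under the assumption (not confirmed in this thread) that shared_reward in environment.py simply replaces each agent’s reward with the sum over all agents.

# Hypothetical illustration of the shared_reward distinction from question 2.
# Each agent first gets its own scenario reward (for simple_tag adversaries this is
# driven by is_collision, so adversaries often end up with identical values anyway);
# shared_reward=True would then replace every entry with the team-wide sum.
def collect_rewards(per_agent_rewards, shared_reward):
    if shared_reward:
        total = sum(per_agent_rewards)
        return [total] * len(per_agent_rewards)
    return list(per_agent_rewards)

print(collect_rewards([10.0, 10.0, -5.0], shared_reward=False))  # [10.0, 10.0, -5.0]
print(collect_rewards([10.0, 10.0, -5.0], shared_reward=True))   # [15.0, 15.0, 15.0]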

3 reactions
Haoxiang-Wang commented, Nov 13, 2017

Agreed. If you OpenAI guys could release a simple example of random agents in all the environments, it would be a great relief. I hope there will also be an explanation of the action space and of how to take actions in the different environments, since it’s quite confusing. Thank you.

Read more comments on GitHub >

