Action Discrete(5) and reward in "simple_tag" env
See original GitHub issue.
The action space for each agent is declared as Discrete(5), but the action that is actually consumed behaves like a Box(5) within (-1, 1). The code

    agent.action.u[0] += action[0][1] - action[0][2]
    agent.action.u[1] += action[0][3] - action[0][4]

is used to get p_force and then p_vel. So what does action[0][0] do?
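For reference, the surrounding logic in environment.py (_set_action) looks roughly like this; it is a paraphrase of the multiagent-particle-envs code rather than an exact quote:

    # paraphrased from MultiAgentEnv._set_action in environment.py
    agent.action.u = np.zeros(self.world.dim_p)  # 2-D physical force
    if self.discrete_action_input:
        # action is a single integer: 0 is "do nothing", 1-4 pick a direction
        if action[0] == 1: agent.action.u[0] = -1.0
        if action[0] == 2: agent.action.u[0] = +1.0
        if action[0] == 3: agent.action.u[1] = -1.0
        if action[0] == 4: agent.action.u[1] = +1.0
    else:
        if self.discrete_action_space:
            # action[0] is a length-5 vector; only indices 1-4 are read in this branch
            agent.action.u[0] += action[0][1] - action[0][2]
            agent.action.u[1] += action[0][3] - action[0][4]
        else:
            agent.action.u = action[0]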
The reward of the adversary agents at each step is based on is_collision, which turns out to be the same reward for every adversary agent, even if we include the distance penalty in the case shape = True. How is that different from self.shared_reward = True in environment.py?
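For context, the two pieces of code in question look roughly like this (paraphrased from simple_tag.py and environment.py; not exact quotes):

    # paraphrased from Scenario.adversary_reward in simple_tag.py
    def adversary_reward(self, agent, world):
        rew = 0
        shape = False
        agents = self.good_agents(world)
        adversaries = self.adversaries(world)
        if shape:
            # shaping term: for every adversary, subtract its distance to the
            # nearest good agent; the sum does not depend on which adversary
            # is being evaluated, so all adversaries still get the same value
            for adv in adversaries:
                rew -= 0.1 * min([np.sqrt(np.sum(np.square(a.state.p_pos - adv.state.p_pos)))
                                  for a in agents])
        if agent.collide:
            for ag in agents:
                for adv in adversaries:
                    if self.is_collision(ag, adv):
                        rew += 10
        return rew

    # paraphrased from MultiAgentEnv.step in environment.py
    reward = np.sum(reward_n)            # sum over ALL agents, good and adversary
    if self.shared_reward:
        reward_n = [reward] * self.n     # every agent receives that total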
I don’t mean to complain, I just wonder how it works. I’d appreciate it if you could answer.
Issue Analytics
- Created: 6 years ago
- Reactions: 1
- Comments: 5 (1 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
@NorthernWolf, I’m not a maintainer/author, but I was playing around with it this morning and I think I have a simple example you can use to give all agents a random action in any of these environments; just replace make_env('simple_push') with the name of the scenario you want to watch (see the sketch below). Hope it helps!
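A minimal sketch along those lines, assuming the make_env helper shipped at the top level of the multiagent-particle-envs repo and a scenario whose agents only have movement actions (as in simple_tag or simple_push), so shapes may need adjusting for other scenarios:

    import numpy as np
    from make_env import make_env  # helper in the multiagent-particle-envs repo

    env = make_env('simple_push')  # replace with the scenario you want to watch
    obs_n = env.reset()
    for _ in range(200):
        act_n = []
        for space in env.action_space:
            # the declared space is Discrete(5), but the env consumes a
            # length-5 vector (e.g. a one-hot), not an integer index
            a = np.zeros(5)
            a[np.random.randint(5)] = 1.0
            act_n.append(a)
        obs_n, reward_n, done_n, info_n = env.step(act_n)
        env.render()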
Agree. If you OpenAI guys could release a simple example of random agents in all environments, it would be a great relief. I also hope there will be an explanation of the action space and how to take actions in the different environments, since it’s quite confusing. Thank you.