[Bug Report] `MuJoCo.Ant` contact forces being off by default is based on a wrong experiment
Describe the bug
The problem

Due to https://github.com/openai/gym/pull/2762#issuecomment-1135362092 it was decided that `use_contact_forces` would default to `False`. However, the two problem configurations compared in that experiment used DIFFERENT REWARD FUNCTIONS.
As you can see here, the reward functions are indeed different: https://github.com/rodrigodelazcano/gym/blob/9c9741498dd0b613fb2d418f17d77ab5f6e60476/gym/envs/mujoco/ant_v4.py#L264
This behavior (of differing reward functions) is also not documented at all (I can make a PR for that).
Code at that commit (identical to the current code, as far as this issue is concerned): https://github.com/rodrigodelazcano/gym/blob/9c9741498dd0b613fb2d418f17d77ab5f6e60476/gym/envs/mujoco/ant_v4.py
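To make the difference concrete, here is a minimal, standalone sketch of the branching visible in the linked `ant_v4.py` (the function name and the numeric values below are illustrative, not the environment defaults):

```python
def ant_v4_reward(forward_reward, healthy_reward, ctrl_cost,
                  contact_cost, use_contact_forces):
    """Sketch of the Ant-v4 reward structure: contact_cost is only
    subtracted when use_contact_forces is enabled, so the two settings
    optimize different reward functions."""
    rewards = forward_reward + healthy_reward
    costs = ctrl_cost
    if use_contact_forces:
        costs += contact_cost  # extra penalty term absent in the default config
    return rewards - costs

# Same hypothetical physics step, two different rewards:
r_default = ant_v4_reward(1.25, 1.0, 0.25, 0.5, use_contact_forces=False)  # 2.0
r_forces  = ant_v4_reward(1.25, 1.0, 0.25, 0.5, use_contact_forces=True)   # 1.5
```

Because the penalty accumulates every step, the two settings diverge further over the course of an episode, which is why comparing returns across them is not an apples-to-apples experiment.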
Code example
No response
System info
No response
Additional context
No response
Checklist
- I have checked that there is no similar issue in the repo
Issue Analytics
- Created 9 months ago
- Comments: 6 (6 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Thanks both of you. My proposed answer would be to add a note to the Ant documentation describing this difference between v2/v3 and v4 (with default parameters), and recommending `use_contact_forces` for equivalence between mujoco-py and the mujoco bindings. Additionally, once the experiments are completed, we need to add a comment to the original PR, which we can link in the documentation. This should avoid the need for a v5 environment: if users read the documentation, they can understand the difference between versions.

@Kallinteris-Andreas yes, you are right. We missed taking into account the fact that adding/removing `contact_cost` to the reward function would affect the return differently in the long run. The initial justification of a performance degradation due to contact forces in the observations is wrong, sorry about that. I'll run the suggested experiments this weekend to make sure that's the case.

The past versions of the environment (v2/v3) are being kept for reproducibility of past research. We haven't modified anything, and they can still be used with `mujoco_py` and older versions of MuJoCo.

As you've mentioned, the v4 environments upgrade to the latest mujoco bindings instead of `mujoco-py`, since the latter is no longer maintained. In the process we also decided to fix the external-contact-forces issue you've mentioned, which appeared with later MuJoCo versions (`mujoco>=2.0`). Thus the reward function of v4 is no different from v2/v3 if external contact forces are included in the observation and reward, via `use_contact_forces`.

However, because we observed successful learning without contact forces in https://github.com/openai/gym/pull/2762#issuecomment-1135362092, we decided to make the use of external forces (in observation and reward) optional. Having said this, I don't think there is a need for a v5 version other than your documentation updates mentioning that using contact forces will affect the reward function, which is highly important, and I thank you for finding this out. What do you think @pseudo-rnd-thoughts?
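For reference, the contact cost that `use_contact_forces` switches on has roughly this shape (a sketch mirroring the structure in the linked `ant_v4.py`; the weight and clipping range shown are assumed defaults and should be checked against the source):

```python
def contact_cost(cfrc_ext, weight=5e-4, force_range=(-1.0, 1.0)):
    """Sketch: weighted sum of squared external contact forces,
    each clipped to force_range before squaring (as in Ant-v3/v4)."""
    lo, hi = force_range
    clipped = [min(max(f, lo), hi) for f in cfrc_ext]  # clip each force component
    return weight * sum(f * f for f in clipped)

# With no external contact forces the cost vanishes, so the two reward
# functions only differ while the ant is actually in contact:
contact_cost([0.0, 0.0, 0.0])   # 0.0
contact_cost([2.0, -3.0, 0.5])  # weight * (1 + 1 + 0.25), after clipping
```

This is why the bug linked above (Ant-v2/v3 reporting all-zero contact forces under `mujoco>=2.0`) silently made the penalty a no-op, and why fixing it changes the effective reward.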