[Bug Report] `MuJoCo.Ant` contact forces being off by default is based on a wrong experiment
Describe the bug
The problem

Due to https://github.com/openai/gym/pull/2762#issuecomment-1135362092 it was decided that `use_contact_forces` would default to `False`. However, the two problem configurations compared in that experiment used DIFFERENT REWARD FUNCTIONS.
As you can see here, the reward functions are indeed different: https://github.com/rodrigodelazcano/gym/blob/9c9741498dd0b613fb2d418f17d77ab5f6e60476/gym/envs/mujoco/ant_v4.py#L264
This behavior (of differing reward functions) is also not documented at all (I can make a PR for that).
Code at that commit (identical to the current code, as far as this issue is concerned): https://github.com/rodrigodelazcano/gym/blob/9c9741498dd0b613fb2d418f17d77ab5f6e60476/gym/envs/mujoco/ant_v4.py
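To make the difference concrete, here is a minimal, standalone sketch of the branching visible in the linked `ant_v4.py` (the function name and the numeric values below are illustrative, not the environment defaults):

```python
def ant_v4_reward(forward_reward, healthy_reward, ctrl_cost,
                  contact_cost, use_contact_forces):
    """Sketch of the Ant-v4 reward structure: contact_cost is only
    subtracted when use_contact_forces is enabled, so the two settings
    optimize different reward functions."""
    rewards = forward_reward + healthy_reward
    costs = ctrl_cost
    if use_contact_forces:
        costs += contact_cost  # extra penalty term absent in the default config
    return rewards - costs

# Same hypothetical physics step, two different rewards:
r_default = ant_v4_reward(1.25, 1.0, 0.25, 0.5, use_contact_forces=False)  # 2.0
r_forces  = ant_v4_reward(1.25, 1.0, 0.25, 0.5, use_contact_forces=True)   # 1.5
```

Because the penalty accumulates every step, the two settings diverge further over the course of an episode, which is why comparing returns across them is not an apples-to-apples experiment.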
Code example
No response
System info
No response
Additional context
No response
Checklist
- I have checked that there is no similar issue in the repo
Issue Analytics
- Created 9 months ago
- Comments: 6 (6 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Thanks both of you. My proposed answer would be to add a note to the Ant documentation describing this difference between v2/v3 and v4 (with default parameters), and recommending `use_contact_forces` for equivalence between mujoco-py and the mujoco bindings. Additionally, once the experiments are completed, we need to add a comment to the original PR, which we can link in the documentation. This should avoid the need for a v5 environment: if users read the documentation, they can understand the difference between versions.

@Kallinteris-Andreas yes, you are right. We missed taking into account the fact that adding/removing `contact_cost` to the reward function would affect the return differently in the long run. The initial justification of a performance degradation due to contact forces in the observations is wrong, sorry about that. I'll run the suggested experiments this weekend to make sure that's the case.

The past versions of the environment (v2/v3) are being kept for reproducibility of past research. We haven't modified anything, and they can still be used with `mujoco_py` and older versions of MuJoCo.

As you've mentioned, the v4 environments upgrade to the latest mujoco bindings instead of `mujoco-py`, since the latter is no longer maintained. In the process we also decided to fix the external-contact-forces issue you've mentioned, which appeared with later MuJoCo versions (`mujoco>=2.0`). Thus the reward function of v4 is no different from v2/v3 if external contact forces are included in the observation and reward, via `use_contact_forces`.

However, because we observed successful learning without contact forces in https://github.com/openai/gym/pull/2762#issuecomment-1135362092, we decided to make the use of external forces (in observation and reward) optional. Having said this, I don't think there is a need for a v5 version other than your documentation updates mentioning that using contact forces will affect the reward function, which is highly important, and I thank you for finding this out. What do you think @pseudo-rnd-thoughts?
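For reference, the contact cost that `use_contact_forces` switches on has roughly this shape (a sketch mirroring the structure in the linked `ant_v4.py`; the weight and clipping range shown are assumed defaults and should be checked against the source):

```python
def contact_cost(cfrc_ext, weight=5e-4, force_range=(-1.0, 1.0)):
    """Sketch: weighted sum of squared external contact forces,
    each clipped to force_range before squaring (as in Ant-v3/v4)."""
    lo, hi = force_range
    clipped = [min(max(f, lo), hi) for f in cfrc_ext]  # clip each force component
    return weight * sum(f * f for f in clipped)

# With no external contact forces the cost vanishes, so the two reward
# functions only differ while the ant is actually in contact:
contact_cost([0.0, 0.0, 0.0])   # 0.0
contact_cost([2.0, -3.0, 0.5])  # weight * (1 + 1 + 0.25), after clipping
```

This is why the bug linked above (Ant-v2/v3 reporting all-zero contact forces under `mujoco>=2.0`) silently made the penalty a no-op, and why fixing it changes the effective reward.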