
[rllib] Vectorization & multi-agent are broken in DDPG (both TF and Torch)

See original GitHub issue

What is the problem?

./train.py --run=DDPG --env=MountainCarContinuous-v0 --config='{"num_envs_per_worker": 2}'
./train.py --run=DDPG --env=MountainCarContinuous-v0 --config='{"num_envs_per_worker": 2}' --torch

Both currently crash. The issue seems to be that while the input to compute_actions() is a batch of N observations, only 1 action is returned as output. I ran into this while debugging a hang in a multi-agent test case (the hang was caused by a bug in the env, triggered by vectorization returning 1 action instead of N).
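
For context, the contract being violated can be sketched with the Python API: with "num_envs_per_worker": 2, compute_actions() should return one action per observation in the input batch. The snippet below is a minimal illustrative check, not part of the original report, and it assumes the RLlib API from around the time this issue was filed (ray.rllib.agents.ddpg.DDPGTrainer and Policy.compute_actions):

# Illustrative sketch only (assumes the Ray ~0.8.x-era API in use when this
# issue was filed); checks that a batch of 2 observations yields 2 actions.
import gym
import numpy as np
import ray
from ray.rllib.agents.ddpg import DDPGTrainer

ray.init()
trainer = DDPGTrainer(
    env="MountainCarContinuous-v0",
    config={"num_envs_per_worker": 2, "num_workers": 0},
)
policy = trainer.get_policy()

# Two observations, standing in for the two vectorized sub-environments.
obs_space = gym.make("MountainCarContinuous-v0").observation_space
obs_batch = np.stack([obs_space.sample() for _ in range(2)])

actions, _, _ = policy.compute_actions(obs_batch)
# On an affected build, only 1 action comes back instead of 2.
assert len(actions) == 2, f"expected 2 actions, got {len(actions)}"

On an affected build the assertion fails, which matches the crash seen with the train.py commands above.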

Issue Analytics

  • State: closed
  • Created: 3 years ago
  • Comments: 6 (6 by maintainers)

Top GitHub Comments

1 reaction
ericl commented, May 5, 2020

git bisect run script.sh

script:

# Rebuild the Ray Python package in place, then run the repro case.
cd python
sudo SKIP_THIRDPARTY_INSTALL=1 SKIP_PYARROW_INSTALL=1 python setup.py develop
cd -
cd rllib
# A non-zero exit here (i.e. the crash) tells `git bisect run` that the commit is bad.
./train.py --run=DDPG --env=MountainCarContinuous-v0 --config='{"num_envs_per_worker": 2}' --stop='{"timesteps_total": 1}'
83e06cd30a45245c2cb0e9f4bd924224b1581554 is the first bad commit
commit 83e06cd30a45245c2cb0e9f4bd924224b1581554
Author: Sven Mika <sven@anyscale.io>
Date:   Sun Mar 1 20:53:35 2020 +0100

    [RLlib] DDPG refactor and Exploration API action noise classes. (#7314)
0 reactions
sven1977 commented, May 5, 2020

Leaving this open until merged.

Read more comments on GitHub.

Top Results From Across the Web

Algorithms — Ray 1.13.0
[paper] [implementation] RLlib implements both A2C and A3C. These algorithms scale to 16-32+ worker processes depending on the environment.

Algorithms — Ray 2.2.0 - the Ray documentation
Instead, gradients are computed remotely on each rollout worker and all-reduced at each mini-batch using torch distributed. This allows each worker's GPU to...

Getting Started with RLlib — Ray 2.2.0 - the Ray documentation
In multi-agent training, the algorithm manages the querying and ... Policy- or the Algorithm's checkpoints also contain (tf or torch) native model files....

Examples — Ray 2.2.0
Example of how to setup an RLlib Algorithm against a locally running Unity3D editor instance to learn any Unity3D game (including support for...

rllib-training.rst.txt - the Ray documentation
In multi-agent training, the algorithm manages the querying and optimization of multiple ... if eager_tracing=True) # torch: PyTorch "framework": "tf", ...
