ray tune error with multiagent policy graph
My error happens in run_experiments(), specifically:
ray.tune.error.TuneError: ('Trials did not complete', [PPO_WaveAttenuationPOEnv-v0_0_lr=1e-05])
Closing connection to TraCI and stopping simulation.
My code is based on the multiagent example “multiagent_stabilizing_the_ring”. Basically, I want to run multiple RL CAVs on the same ring road with a shared PPO policy. Please let me know if I have misunderstood something. Unlike that example, I set
```python
env_name="WaveAttenuationPOEnv",
scenario="LoopScenario",
```
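For context, those two fields would sit in the experiment's flow_params dict roughly as below. This is a minimal sketch assuming the flow_params layout used by the older Flow examples; the exp_tag and all elided entries are placeholders, not values from this issue.

```python
# Sketch only: assumed flow_params layout from the older Flow examples.
# exp_tag is hypothetical; the remaining entries are unchanged from the
# multiagent_stabilizing_the_ring example and are elided here.
flow_params = dict(
    exp_tag="ring_shared_ppo",          # hypothetical experiment tag
    env_name="WaveAttenuationPOEnv",    # changed relative to the multiagent example
    scenario="LoopScenario",            # changed relative to the multiagent example
    # ... sim/env/net/vehicle/initial parameters as in the original example ...
)
```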
The policy graph part was kept the same:
```python
def gen_policy():
    return (PPOPolicyGraph, obs_space, act_space, {})

# Set up PG with an ensemble of `num_policies` different policy graphs
policy_graphs = {'av': gen_policy()}

def policy_mapping_fn(_):
    return 'av'

config.update({
    'multiagent': {
        'policy_graphs': policy_graphs,
        'policy_mapping_fn': tune.function(policy_mapping_fn),
        'policies_to_train': ['av']
    }
})
```
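For reference, here is a minimal sketch of how a config like this is then handed to run_experiments() in the older Ray Tune API that these examples use. The experiment tag, stopping criterion, and checkpoint settings below are illustrative assumptions, not values from the issue.

```python
from ray.tune import run_experiments

# Sketch only: the experiment-spec keys follow the older Ray Tune API used by
# these Flow examples; the tag, stop condition and checkpoint_freq are
# illustrative.
run_experiments({
    "ring_shared_ppo": {
        "run": "PPO",                        # trainer matching PPOPolicyGraph
        "env": "WaveAttenuationPOEnv-v0",    # registered env name from the error above
        "config": config,                    # the dict updated with 'multiagent' above
        "checkpoint_freq": 20,
        "stop": {"training_iteration": 200},
        "num_samples": 1,
    }
})
```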
Additionally, it is still not clear to me whether the shared policy is defined over single-agent states and actions or over the joint states and actions. Any help would be truly appreciated!
Issue Analytics
- State:
- Created 4 years ago
- Comments: 9 (5 by maintainers)
Top Results From Across the Web
- Tune doesn't work with multi agent env · Issue #3785 · ray-...
  The issue is that tune is trying to expand lambda functions to generate trial variants. To fix that, you can 'escape' the policy...
- How To Customize Policies — Ray 2.2.0
  Policy classes encapsulate the core numerical components of RL algorithms. This typically includes the policy model that determines actions to take, a...
- Policies — Ray 2.2.0
  The Policy class contains functionality to compute actions for decision making in an environment, as well as computing loss(es) and gradients, updating a...
- Examples — Ray 2.2.0
  This blog post is a brief tutorial on multi-agent RL and its design in RLlib. ... This script offers a simple workflow for...
- Environments — Ray 2.2.0
  Here we plot just the throughput of RLlib policy evaluation from 1 to 128 CPUs. ... This API allows you to implement any...
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
That is correct; batches are collected from both RL vehicles and used for the training.
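To make that concrete, here is a small illustrative sketch of what one step of a multi-agent env looks like with this shared policy: each RL vehicle shows up under its own agent id with its own single-agent observation, policy_mapping_fn sends every id to 'av', and all of those transitions end up in the batch that trains that one policy. The agent ids and numbers are made up.

```python
# Illustrative only: per-agent dicts returned by an RLlib multi-agent env.
# Each RL vehicle contributes its own (single-agent) observation and reward;
# nothing is concatenated into a joint state.
obs = {
    "rl_0": [0.31, 0.12],   # hypothetical observation of the first RL vehicle
    "rl_1": [0.28, 0.40],   # hypothetical observation of the second RL vehicle
}
rewards = {"rl_0": 0.9, "rl_1": 1.1}
dones = {"rl_0": False, "rl_1": False, "__all__": False}

# Both agents map to the same shared policy, so both sets of experience are
# collected into the batch used to train 'av'.
assert policy_mapping_fn("rl_0") == policy_mapping_fn("rl_1") == "av"
```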
Ah, glad that helped. So, for each policy graph you can just construct your desired action space as is currently done in the action_space and observation_space methods in the MultiWaveAttenuationEnv. For example, if you want to control two accelerations at once, you might make the action space Box(low=min_accel, high=max_accel, shape=(2,)) which will tell the policy graph to have two values as the output of the neural network.
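As a concrete (and purely illustrative) version of that suggestion, the per-policy spaces could be built with gym.spaces.Box; the bounds and the observation size below are placeholders, not values taken from Flow.

```python
import numpy as np
from gym.spaces import Box

# Placeholder bounds; in Flow these would come from the env parameters.
max_accel, max_decel = 1.0, 1.0

# Action space controlling two accelerations at once, as suggested above:
# shape=(2,) makes the policy network output two values per step.
act_space = Box(low=-max_decel, high=max_accel, shape=(2,), dtype=np.float32)

# Matching observation space; the size (e.g. speed and headway for each of
# the two controlled vehicles) is illustrative only.
obs_space = Box(low=-np.inf, high=np.inf, shape=(4,), dtype=np.float32)

# These are the spaces that would then be passed into gen_policy() above.
```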