
ray tune error with multiagent policy graph

See original GitHub issue

My error happens in run_experiments(), specifically:

ray.tune.error.TuneError: ('Trials did not complete', [PPO_WaveAttenuationPOEnv-v0_0_lr=1e-05])
Closing connection to TraCI and stopping simulation.

My code is based on the multiagent example project “multiagent_stabilizing_the_ring”. Basically, I want to run multiple RL CAVs on the same ring road with a shared PPO policy. Please let me know if I have misunderstood something. Unlike that example, I set

env_name="WaveAttenuationPOEnv",
         scenario="LoopScenario", 

The policy graph part was kept the same:

def gen_policy():
    return (PPOPolicyGraph, obs_space, act_space, {})

# Setup PG with an ensemble of `num_policies` different policy graphs
policy_graphs = {'av': gen_policy()}

def policy_mapping_fn(_):
    return 'av'

config.update({
    'multiagent': {
        'policy_graphs': policy_graphs,
        'policy_mapping_fn': tune.function(policy_mapping_fn),
        'policies_to_train': ['av']
    }
})
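
For reference, here is a minimal sketch of how the obs_space / act_space handed to gen_policy() and the surrounding run_experiments() call could fit around this config. The experiment name, stop criteria, checkpoint frequency, and the create_env constructor are illustrative assumptions based on the Flow examples, not taken from the issue; only the registered env name appears in the error above.

import ray
from ray.tune import run_experiments

# obs_space / act_space used by gen_policy() are the *per-agent* spaces.
# A common pattern is to read them off an instantiated copy of the env
# (create_env stands in for the env constructor used by the Flow example):
#     test_env = create_env()
#     obs_space = test_env.observation_space
#     act_space = test_env.action_space

ray.init()
run_experiments({
    'ring_shared_ppo': {                   # illustrative experiment name
        'run': 'PPO',
        'env': 'WaveAttenuationPOEnv-v0',  # env name from the error trace
        'config': config,                  # includes the 'multiagent' block above
        'checkpoint_freq': 20,
        'stop': {'training_iteration': 200},
    },
})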

Additionally, it is still not clear to me whether the shared policy is defined over the single-agent states and actions or over the joint states and actions. Any help would be truly appreciated!

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

2 reactions
eugenevinitsky commented, Mar 28, 2019

That is correct; batches are collected from both RL vehicles and used for the training.
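
To make that concrete, here is an illustrative sketch (agent IDs and observation values are made up) of what RLlib sees on each step of the multiagent env: every RL vehicle reports its own single-agent observation, the mapping function routes all of them to the shared 'av' policy, and their transitions are batched together for that one policy's update.

import numpy as np

# Illustrative only: one step of the multiagent env as RLlib sees it.
# Each key is an RL vehicle's agent id; each value is that agent's own
# single-agent observation, not a joint observation of all vehicles.
obs = {
    'rl_0': np.array([0.3, 0.7, 0.1]),
    'rl_1': np.array([0.5, 0.2, 0.9]),
}

# policy_mapping_fn (defined in the question) sends every agent id to 'av' ...
assert policy_mapping_fn('rl_0') == 'av'
assert policy_mapping_fn('rl_1') == 'av'

# ... so samples from both vehicles land in the same 'av' batch and are
# used together to update the single shared PPO policy graph.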

1 reaction
eugenevinitsky commented, Mar 24, 2019

Ah, glad that helped. So, for each policy graph you can just construct your desired action space as is currently done in the action_space and observation_space methods in the MultiWaveAttenuationEnv. For example, if you want to control two accelerations at once, you might make the action space Box(low=min_accel, high=max_accel, shape=(2,)) which will tell the policy graph to have two values as the output of the neural network.
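
As a concrete illustration of that suggestion (the acceleration bounds and the observation shape below are placeholders, not values from the issue), the spaces could be built like this and then handed to gen_policy():

import numpy as np
from gym.spaces import Box

# Placeholder bounds; in Flow these would come from the vehicle/env params.
min_accel, max_accel = -1.0, 1.0

# Action space controlling two accelerations at once: the policy network
# then has two output values per step.
act_space = Box(low=min_accel, high=max_accel, shape=(2,), dtype=np.float32)

# A matching, purely illustrative per-agent observation space, e.g. speed
# and headway for the ego vehicle and its leader.
obs_space = Box(low=-np.inf, high=np.inf, shape=(4,), dtype=np.float32)

# These are the spaces gen_policy() above returns to RLlib:
#     return (PPOPolicyGraph, obs_space, act_space, {})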


Top Results From Across the Web

  • Tune doesn't work with multi agent env · Issue #3785 · ray- ...
    The issue is that tune is trying to expand lambda functions to generate trial variants. To fix that, you can 'escape' the policy... (see the sketch after this list)
  • How To Customize Policies — Ray 2.2.0
    Policy classes encapsulate the core numerical components of RL algorithms. This typically includes the policy model that determines actions to take, a ...
  • Policies — Ray 2.2.0
    The Policy class contains functionality to compute actions for decision making in an environment, as well as computing loss(es) and gradients, updating a...
  • Examples — Ray 2.2.0
    This blog post is a brief tutorial on multi-agent RL and its design in RLlib. ... This script offers a simple workflow for...
  • Environments — Ray 2.2.0
    Here we plot just the throughput of RLlib policy evaluation from 1 to 128 CPUs. ... This API allows you to implement any...
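
For completeness, a minimal sketch of the 'escaping' fix mentioned in the first result above, using the same tune.function() wrapper that the question's config already applies (the lambda is illustrative):

from ray import tune

# Without escaping, tune treats plain callables/lambdas in the config as
# values to expand into trial variants, which breaks multiagent configs.
# Wrapping them with tune.function() marks them as opaque values instead.
config['multiagent']['policy_mapping_fn'] = tune.function(lambda agent_id: 'av')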
