[rllib] AttributeError: 'PPO' object has no attribute 'workers'
What is the problem?
Ray version: 0.8.2 (also reproduced on 0.8.1). EC2 AMI: ami-030c12d850a83dff1 (Linux).
Calling trainer.workers.foreach_worker from inside an on_train_result callback fails with the following traceback:
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 459, in _process_trial
    result = self.trial_executor.fetch_result(trial)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 377, in fetch_result
    result = ray.get(trial_future[0], DEFAULT_GET_TIMEOUT)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/worker.py", line 1504, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::PPO.train() (pid=3139, ip=172.31.41.218)
  File "python/ray/_raylet.pyx", line 452, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 430, in ray._raylet.execute_task.function_executor
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 494, in train
    raise e
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 483, in train
    result = Trainable.train(self)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/tune/trainable.py", line 316, in train
    self._log_result(result)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 534, in _log_result
    "result": result,
  File "/home/ubuntu/cmbrl/run/ray_bug.py", line 45, in on_train_result
    outputs = trainer.workers.foreach_worker(lambda ev: ev.foreach_env(
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/rllib/evaluation/worker_set.py", line 112, in foreach_worker
    [w.apply.remote(func) for w in self.remote_workers()])
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/rllib/utils/memory.py", line 29, in ray_get_and_free
    result = ray.get(object_ids)
ray.exceptions.RayTaskError(AttributeError): ray::RolloutWorker (pid=3149, ip=172.31.41.218)
  File "python/ray/_raylet.pyx", line 440, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 441, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 313, in ray._raylet.deserialize_args
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/serialization.py", line 301, in deserialize_objects
    self._deserialize_object(data, metadata, object_id))
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/serialization.py", line 249, in _deserialize_object
    return self._deserialize_pickle5_data(data)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/serialization.py", line 238, in _deserialize_pickle5_data
    obj = pickle.loads(in_band, buffers=buffers)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/rllib/agents/trainer_template.py", line 173, in __setstate__
    Trainer.__setstate__(self, state)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 980, in __setstate__
    self.workers.local_worker().restore(state["worker"])
AttributeError: 'PPO' object has no attribute 'workers'
Reproduction (REQUIRED)
Run with command: ray exec autoscale.yaml "python ~/cmbrl/run/ray_bug.py --num_cpus 7 --exp_title ray_bug" --start --stop
import argparse
from copy import deepcopy
import errno
import os
import subprocess
import sys

import numpy as np
import ray
import gym
from gym import spaces
from ray.rllib.agents.ppo.ppo_tf_policy import PPOTFPolicy
from ray.rllib.agents.ppo.ppo import PPOTrainer, DEFAULT_CONFIG as DEFAULT_PPO_CONFIG
from ray import tune
from ray.tune import run as run_tune
from ray.tune.registry import register_env


class TestEnv(ray.rllib.MultiAgentEnv):
    def __init__(self):
        super().__init__()
        self.observation_space = spaces.Box(low=0, high=1, shape=(2,))
        self.action_space = spaces.Box(low=0, high=1, shape=(1,))
        self.increment = 1.0

    def update_curriculum(self, increment):
        self.increment = float(increment)

    def step(self, action):
        return {'agent': np.array([1 / self.increment, 0.1])}, {'agent': action['agent'][0]}, \
            {'__all__': True, 'agent': True}, {}

    def reset(self):
        return {'agent': np.array([1 / self.increment, 0.1])}


def make_create_env(env_name):
    def create_env(config):
        return TestEnv()
    return create_env


def on_train_result(info):
    trainer = info['trainer']
    outputs = trainer.workers.foreach_worker(lambda ev: ev.foreach_env(
        lambda env: env.update_curriculum(info['result']['training_iteration'])))


def setup_exp(args):
    if args.algorithm == 'PPO':
        config = deepcopy(DEFAULT_PPO_CONFIG)
    config['num_workers'] = args.num_cpus
    env_name = "test_env-v0"
    config['env'] = env_name
    config["callbacks"] = {"on_train_result": on_train_result}

    create_env = make_create_env(env_name)
    env_name = register_env(env_name, create_env)
    env = create_env(env_name)

    policies_to_train = ['agent']
    if args.algorithm == 'PPO':
        policy_graphs = {'agent': (PPOTFPolicy, env.observation_space, env.action_space, {})}
    config.update({
        'multiagent': {
            'policies': policy_graphs,
            'policy_mapping_fn': lambda agent_id: 'agent',
            'policies_to_train': policies_to_train
        }
    })

    # create a custom string that makes looking at the experiment names easier
    def trial_str_creator(trial):
        return "{}_{}".format(trial.trainable_name, trial.experiment_tag)

    exp_dict = {
        'name': args.exp_title,
        'run_or_experiment': args.algorithm,
        'trial_name_creator': trial_str_creator,
        'checkpoint_freq': args.checkpoint_freq,
        'stop': {
            'training_iteration': args.num_iters
        },
        'config': config,
        'num_samples': args.num_samples,
    }
    return exp_dict


if __name__ == "__main__":
    parser = argparse.ArgumentParser('Parse some arguments my guy.')
    parser.add_argument(
        '--algorithm', choices=['PPO'], default='PPO', type=str)
    parser.add_argument('--exp_title', type=str, default='test',
                        help='Informative experiment title to help distinguish results')
    parser.add_argument('--num_cpus', type=int, default=1,
                        help='Number of cpus to run experiment with')
    parser.add_argument('--multi_node', action='store_true',
                        help='Set to true if this will be run in cluster mode')
    parser.add_argument('--local_mode', action='store_true',
                        help='Set to true if this will be run in local mode')
    parser.add_argument('--train_batch_size', type=int, default=10000,
                        help='How many steps go into a training batch')
    parser.add_argument('--num_iters', type=int, default=350)
    parser.add_argument('--num_samples', type=int, default=1)
    parser.add_argument('--checkpoint_freq', type=int, default=50)
    parser.add_argument('--grid_search', action='store_true',
                        help='If true, grid search hyperparams')
    args = parser.parse_args()

    exp_dict = setup_exp(args)

    if args.multi_node:
        ray.init(address='localhost:6379')
    elif args.local_mode:
        ray.init(local_mode=True)
    else:
        ray.init()

    run_tune(**exp_dict, queue_trials=False, raise_on_failed_trial=True)
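For context, an editor-added illustration (not part of the original report) of why the callback above fails: the lambda passed to foreach_env closes over the whole info dict, which holds the PPO trainer under info['trainer']. When Ray pickles that closure to ship it to the remote rollout workers it must pickle the trainer too, and deserializing a Trainer this way lands in Trainer.__setstate__ before self.workers has ever been created, which is exactly the AttributeError in the traceback. Roughly, assuming a populated info dict from the callback:

    # Editor's sketch (not in the original issue): reproduce the serialization
    # failure without a cluster. `info` stands for the dict RLlib passes to
    # on_train_result and is assumed to contain the live trainer under
    # info['trainer'], as used in the script above.
    import pickle
    from ray import cloudpickle


    def make_curriculum_fn(info):
        # Capturing `info` also captures info['trainer'] (the PPO object).
        return lambda env: env.update_curriculum(info['result']['training_iteration'])


    # bad_fn = make_curriculum_fn(info)
    # pickle.loads(cloudpickle.dumps(bad_fn))
    # -> AttributeError: 'PPO' object has no attribute 'workers'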
Autoscaler config (autoscale.yaml):
# A unique identifier for the head node and workers of this cluster.
cluster_name: cluster_name

# The minimum number of worker nodes to launch in addition to the head
# node. This number should be >= 0.
min_workers: 0

# The maximum number of worker nodes to launch in addition to the head
# node. This takes precedence over min_workers.
max_workers: 0

# The initial number of worker nodes to launch in addition to the head
# node. When the cluster is first brought up (or when it is refreshed with a
# subsequent `ray up`) this number of nodes will be started.
initial_workers: 0

# Whether or not to autoscale aggressively. If this is enabled, if at any point
# we would start more workers, we start at least enough to bring us to
# initial_workers.
autoscaling_mode: default

# This executes all commands on all nodes in the docker container,
# and opens all the necessary ports to support the Ray cluster.
# Empty string means disabled.
docker:
    image: ""  # e.g., tensorflow/tensorflow:1.5.0-py3
    container_name: ""  # e.g. ray_docker
    # If true, pulls latest version of image. Otherwise, `docker run` will only pull the image
    # if no cached version is present.
    pull_before_run: True
    run_options: []  # Extra options to pass into "docker run"

    # Example of running a GPU head with CPU workers
    # head_image: "tensorflow/tensorflow:1.13.1-py3"
    # head_run_options:
    #     - --runtime=nvidia
    # worker_image: "ubuntu:18.04"
    # worker_run_options: []

# The autoscaler will scale up the cluster to this target fraction of resource
# usage. For example, if a cluster of 10 nodes is 100% busy and
# target_utilization is 0.8, it would resize the cluster to 13. This fraction
# can be decreased to increase the aggressiveness of upscaling.
# This value must be less than 1.0 for scaling to happen.
target_utilization_fraction: 0.8

# If a node is idle for this many minutes, it will be removed.
idle_timeout_minutes: 5

# Cloud-provider specific configuration.
provider:
    type: aws
    region: us-west-2
    # Availability zone(s), comma-separated, that nodes may be launched in.
    # Nodes are currently spread between zones by a round-robin approach,
    # however this implementation detail should not be relied upon.
    availability_zone: us-west-2a,us-west-2b
    cache_stopped_nodes: False

# How Ray will authenticate with newly launched nodes.
auth:
    ssh_user: ubuntu
    # By default Ray creates a new private keypair, but you can also use your own.
    # If you do so, make sure to also set "KeyName" in the head and worker node
    # configurations below.
    # ssh_private_key: /path/to/your/key.pem

# Provider-specific config for the head node, e.g. instance type. By default
# Ray will auto-configure unspecified fields such as SubnetId and KeyName.
# For more documentation on available fields, see:
# http://boto3.readthedocs.io/en/latest/reference/services/ec2.html#EC2.ServiceResource.create_instances
head_node:
    InstanceType: c4.4xlarge
    ImageId: ami-030c12d850a83dff1  # Deep Learning AMI (Ubuntu) Version 24.3

    # You can provision additional disk space with a conf as follows
    BlockDeviceMappings:
        - DeviceName: /dev/sda1
          Ebs:
              VolumeSize: 100

    # Additional options in the boto docs.

# Provider-specific config for worker nodes, e.g. instance type. By default
# Ray will auto-configure unspecified fields such as SubnetId and KeyName.
# For more documentation on available fields, see:
# http://boto3.readthedocs.io/en/latest/reference/services/ec2.html#EC2.ServiceResource.create_instances
worker_nodes:
    InstanceType: c4.4xlarge
    ImageId: ami-030c12d850a83dff1  # Deep Learning AMI (Ubuntu) Version 24.3

    # Run workers on spot by default. Comment this out to use on-demand.
    InstanceMarketOptions:
        MarketType: spot
        # Additional options can be found in the boto docs, e.g.
        #   SpotOptions:
        #       MaxPrice: MAX_HOURLY_PRICE

    # Additional options in the boto docs.

# List of commands that will be run before `setup_commands`. If docker is
# enabled, these commands will run outside the container and before docker
# is setup.
initialization_commands: []

# List of shell commands to run to set up nodes.
setup_commands:
    - pip install ray==0.8.2

# Custom commands that will be run on the head node after common setup.
head_setup_commands: []

# Custom commands that will be run on worker nodes after common setup.
worker_setup_commands: []

# Command to start ray on the head node. You don't need to change this.
head_start_ray_commands:
    - cd ~/cmbrl && git fetch && git checkout origin/ray_bug
    - cd ~/cmbrl/gym_minigrid && git pull && cd ..
    - ray stop
    - ulimit -n 65536; ray start --head --redis-port=6379 --object-manager-port=8076 --autoscaling-config=~/ray_bootstrap_config.yaml

# Command to start ray on worker nodes. You don't need to change this.
worker_start_ray_commands:
    - cd cmbrl && git fetch && git checkout origin/ray_bug
    - cd ~/cmbrl/gym_minigrid && git pull && cd ..
    - ray stop
    - ulimit -n 65536; ray start --address=$RAY_HEAD_IP:6379 --object-manager-port=8076
If we cannot run your script, we cannot fix your issue.
- I have verified my script runs in a clean environment and reproduces the issue.
- I have verified the issue also occurs with the latest wheels.
Top GitHub Comments
I have run into this issue also, when trying to do distributed rollouts using a previously trained agent. I believe the "traditional" way (in the docs etc.) to load and use a previously trained agent is to use the trainer object itself (in this case PPOTrainer) and call compute_action on it. It would seem natural to share it across Ray workers to do distributed rollouts, but PPOTrainer doesn't seem to be serializable. (You can work around this by loading the PPOTrainer locally in each worker, though it feels a little wrong, seems a bit slow, and generates a ton of logging...)
Here's a quick repro script (tested on the latest wheel for Python 3.6 on Linux as of 2020-06-23) that gives the above error when trying to serialize a PPOTrainer; you get the same AttributeError: 'PPO' object has no attribute 'workers'.
Should PPOTrainer be serializable?
It would be useful for distributed rollouts for some sort of post-training evaluation process, and also if you want to train a new agent in an environment that includes an agent trained earlier (e.g. imagine AlphaGo training against a static earlier version of itself).
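In the meantime, a minimal sketch of the "load the PPOTrainer locally in each worker" workaround described above (editor-added; checkpoint_path, the config handling, and the rollout loop are illustrative assumptions, not code from the issue):

    import ray
    from ray.rllib.agents.ppo.ppo import PPOTrainer


    @ray.remote
    def evaluate_checkpoint(config, checkpoint_path):
        # Ship only picklable arguments; rebuild and restore the trainer here.
        local_config = dict(config, num_workers=0)  # no nested rollout workers needed
        trainer = PPOTrainer(config=local_config)
        trainer.restore(checkpoint_path)

        env = TestEnv()  # same env class as in the repro script above
        obs, done, episode_reward = env.reset(), {'__all__': False}, 0.0
        while not done['__all__']:
            action = trainer.compute_action(obs['agent'], policy_id='agent')
            obs, reward, done, _ = env.step({'agent': action})
            episode_reward += reward['agent']
        return episode_reward


    # rewards = ray.get([evaluate_checkpoint.remote(config, checkpoint_path)
    #                    for _ in range(4)])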
@kanaadp based on your fix it looks like this isn't a bug in RLlib: you just need to make sure your closure does not capture the "info" dict. The info dict isn't serializable, so it raises errors when the closure is called remotely.
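A minimal sketch of that fix (editor-added, following the suggestion above): read the training iteration out of info before building the closure, so the function shipped to the remote workers only captures a picklable integer rather than the info dict and the trainer inside it.

    def on_train_result(info):
        trainer = info['trainer']
        iteration = info['result']['training_iteration']  # plain int, safe to pickle
        trainer.workers.foreach_worker(
            lambda ev: ev.foreach_env(
                lambda env: env.update_curriculum(iteration)))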