[rllib] AttributeError: 'PPO' object has no attribute 'workers'
What is the problem?
Ray version: 0.8.2 (also reproduced on 0.8.1). EC2 AMI: ami-030c12d850a83dff1 (Linux).
Calling trainer.workers.foreach_worker from inside an on_train_result callback fails with the following traceback:
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/tune/trial_runner.py", line 459, in _process_trial
    result = self.trial_executor.fetch_result(trial)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/tune/ray_trial_executor.py", line 377, in fetch_result
    result = ray.get(trial_future[0], DEFAULT_GET_TIMEOUT)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/worker.py", line 1504, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AttributeError): ray::PPO.train() (pid=3139, ip=172.31.41.218)
  File "python/ray/_raylet.pyx", line 452, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 430, in ray._raylet.execute_task.function_executor
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 494, in train
    raise e
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 483, in train
    result = Trainable.train(self)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/tune/trainable.py", line 316, in train
    self._log_result(result)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 534, in _log_result
    "result": result,
  File "/home/ubuntu/cmbrl/run/ray_bug.py", line 45, in on_train_result
    outputs = trainer.workers.foreach_worker(lambda ev: ev.foreach_env(
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/rllib/evaluation/worker_set.py", line 112, in foreach_worker
    [w.apply.remote(func) for w in self.remote_workers()])
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/rllib/utils/memory.py", line 29, in ray_get_and_free
    result = ray.get(object_ids)
ray.exceptions.RayTaskError(AttributeError): ray::RolloutWorker (pid=3149, ip=172.31.41.218)
  File "python/ray/_raylet.pyx", line 440, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 441, in ray._raylet.execute_task
  File "python/ray/_raylet.pyx", line 313, in ray._raylet.deserialize_args
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/serialization.py", line 301, in deserialize_objects
    self._deserialize_object(data, metadata, object_id))
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/serialization.py", line 249, in _deserialize_object
    return self._deserialize_pickle5_data(data)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/serialization.py", line 238, in _deserialize_pickle5_data
    obj = pickle.loads(in_band, buffers=buffers)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/rllib/agents/trainer_template.py", line 173, in __setstate__
    Trainer.__setstate__(self, state)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/ray/rllib/agents/trainer.py", line 980, in __setstate__
    self.workers.local_worker().restore(state["worker"])
AttributeError: 'PPO' object has no attribute 'workers'
Reproduction (REQUIRED)
Run with command: ray exec autoscale.yaml "python ~/cmbrl/run/ray_bug.py --num_cpus 7 --exp_title ray_bug" --start --stop
import argparse
from copy import deepcopy
import errno
import os
import subprocess
import sys

import numpy as np
import ray
import gym
from gym import spaces
from ray.rllib.agents.ppo.ppo_tf_policy import PPOTFPolicy
from ray.rllib.agents.ppo.ppo import PPOTrainer, DEFAULT_CONFIG as DEFAULT_PPO_CONFIG
from ray import tune
from ray.tune import run as run_tune
from ray.tune.registry import register_env


class TestEnv(ray.rllib.MultiAgentEnv):
    def __init__(self):
        super().__init__()
        self.observation_space = spaces.Box(low=0, high=1, shape=(2,))
        self.action_space = spaces.Box(low=0, high=1, shape=(1,))
        self.increment = 1.0

    def update_curriculum(self, increment):
        self.increment = float(increment)

    def step(self, action):
        return {'agent': np.array([1 / self.increment, 0.1])}, {'agent': action['agent'][0]}, \
            {'__all__': True, 'agent': True}, {}

    def reset(self):
        return {'agent': np.array([1 / self.increment, 0.1])}


def make_create_env(env_name):
    def create_env(config):
        return TestEnv()
    return create_env


def on_train_result(info):
    trainer = info['trainer']
    outputs = trainer.workers.foreach_worker(lambda ev: ev.foreach_env(
        lambda env: env.update_curriculum(info['result']['training_iteration'])))


def setup_exp(args):
    if args.algorithm == 'PPO':
        config = deepcopy(DEFAULT_PPO_CONFIG)
    config['num_workers'] = args.num_cpus
    env_name = "test_env-v0"
    config['env'] = env_name
    config["callbacks"] = {"on_train_result": on_train_result}

    create_env = make_create_env(env_name)
    env_name = register_env(env_name, create_env)
    env = create_env(env_name)

    policies_to_train = ['agent']
    if args.algorithm == 'PPO':
        policy_graphs = {'agent': (PPOTFPolicy, env.observation_space, env.action_space, {})}
    config.update({
        'multiagent': {
            'policies': policy_graphs,
            'policy_mapping_fn': lambda agent_id: 'agent',
            'policies_to_train': policies_to_train
        }
    })

    # create a custom string that makes looking at the experiment names easier
    def trial_str_creator(trial):
        return "{}_{}".format(trial.trainable_name, trial.experiment_tag)

    exp_dict = {
        'name': args.exp_title,
        'run_or_experiment': args.algorithm,
        'trial_name_creator': trial_str_creator,
        'checkpoint_freq': args.checkpoint_freq,
        'stop': {
            'training_iteration': args.num_iters
        },
        'config': config,
        'num_samples': args.num_samples,
    }
    return exp_dict


if __name__ == "__main__":
    parser = argparse.ArgumentParser('Parse some arguments my guy.')
    parser.add_argument(
        '--algorithm', choices=['PPO'], default='PPO', type=str)
    parser.add_argument('--exp_title', type=str, default='test',
                        help='Informative experiment title to help distinguish results')
    parser.add_argument('--num_cpus', type=int, default=1,
                        help='Number of cpus to run experiment with')
    parser.add_argument('--multi_node', action='store_true',
                        help='Set to true if this will be run in cluster mode')
    parser.add_argument('--local_mode', action='store_true',
                        help='Set to true if this will be run in local mode')
    parser.add_argument('--train_batch_size', type=int, default=10000,
                        help='How many steps go into a training batch')
    parser.add_argument('--num_iters', type=int, default=350)
    parser.add_argument('--num_samples', type=int, default=1)
    parser.add_argument('--checkpoint_freq', type=int, default=50)
    parser.add_argument('--grid_search', action='store_true',
                        help='If true, grid search hyperparams')
    args = parser.parse_args()

    exp_dict = setup_exp(args)

    if args.multi_node:
        ray.init(address='localhost:6379')
    elif args.local_mode:
        ray.init(local_mode=True)
    else:
        ray.init()

    run_tune(**exp_dict, queue_trials=False, raise_on_failed_trial=True)
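For context, an editor-added illustration (not part of the original report) of why the callback above fails: the lambda passed to foreach_env closes over the whole info dict, which holds the PPO trainer under info['trainer']. When Ray pickles that closure to ship it to the remote rollout workers it must pickle the trainer too, and deserializing a Trainer this way lands in Trainer.__setstate__ before self.workers has ever been created, which is exactly the AttributeError in the traceback. Roughly, assuming a populated info dict from the callback:

    # Editor's sketch (not in the original issue): reproduce the serialization
    # failure without a cluster. `info` stands for the dict RLlib passes to
    # on_train_result and is assumed to contain the live trainer under
    # info['trainer'], as used in the script above.
    import pickle
    from ray import cloudpickle


    def make_curriculum_fn(info):
        # Capturing `info` also captures info['trainer'] (the PPO object).
        return lambda env: env.update_curriculum(info['result']['training_iteration'])


    # bad_fn = make_curriculum_fn(info)
    # pickle.loads(cloudpickle.dumps(bad_fn))
    # -> AttributeError: 'PPO' object has no attribute 'workers'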
Autoscaler config (autoscale.yaml):
# A unique identifier for the head node and workers of this cluster.
cluster_name: cluster_name

# The minimum number of worker nodes to launch in addition to the head
# node. This number should be >= 0.
min_workers: 0

# The maximum number of worker nodes to launch in addition to the head
# node. This takes precedence over min_workers.
max_workers: 0

# The initial number of worker nodes to launch in addition to the head
# node. When the cluster is first brought up (or when it is refreshed with a
# subsequent `ray up`) this number of nodes will be started.
initial_workers: 0

# Whether or not to autoscale aggressively. If this is enabled, if at any point
# we would start more workers, we start at least enough to bring us to
# initial_workers.
autoscaling_mode: default

# This executes all commands on all nodes in the docker container,
# and opens all the necessary ports to support the Ray cluster.
# Empty string means disabled.
docker:
    image: ""  # e.g., tensorflow/tensorflow:1.5.0-py3
    container_name: ""  # e.g. ray_docker
    # If true, pulls latest version of image. Otherwise, `docker run` will only pull the image
    # if no cached version is present.
    pull_before_run: True
    run_options: []  # Extra options to pass into "docker run"

    # Example of running a GPU head with CPU workers
    # head_image: "tensorflow/tensorflow:1.13.1-py3"
    # head_run_options:
    #     - --runtime=nvidia
    # worker_image: "ubuntu:18.04"
    # worker_run_options: []

# The autoscaler will scale up the cluster to this target fraction of resource
# usage. For example, if a cluster of 10 nodes is 100% busy and
# target_utilization is 0.8, it would resize the cluster to 13. This fraction
# can be decreased to increase the aggressiveness of upscaling.
# This value must be less than 1.0 for scaling to happen.
target_utilization_fraction: 0.8

# If a node is idle for this many minutes, it will be removed.
idle_timeout_minutes: 5

# Cloud-provider specific configuration.
provider:
    type: aws
    region: us-west-2
    # Availability zone(s), comma-separated, that nodes may be launched in.
    # Nodes are currently spread between zones by a round-robin approach,
    # however this implementation detail should not be relied upon.
    availability_zone: us-west-2a,us-west-2b
    cache_stopped_nodes: False

# How Ray will authenticate with newly launched nodes.
auth:
    ssh_user: ubuntu
    # By default Ray creates a new private keypair, but you can also use your own.
    # If you do so, make sure to also set "KeyName" in the head and worker node
    # configurations below.
    # ssh_private_key: /path/to/your/key.pem

# Provider-specific config for the head node, e.g. instance type. By default
# Ray will auto-configure unspecified fields such as SubnetId and KeyName.
# For more documentation on available fields, see:
# http://boto3.readthedocs.io/en/latest/reference/services/ec2.html#EC2.ServiceResource.create_instances
head_node:
    InstanceType: c4.4xlarge
    ImageId: ami-030c12d850a83dff1  # Deep Learning AMI (Ubuntu) Version 24.3

    # You can provision additional disk space with a conf as follows
    BlockDeviceMappings:
        - DeviceName: /dev/sda1
          Ebs:
              VolumeSize: 100

    # Additional options in the boto docs.

# Provider-specific config for worker nodes, e.g. instance type. By default
# Ray will auto-configure unspecified fields such as SubnetId and KeyName.
# For more documentation on available fields, see:
# http://boto3.readthedocs.io/en/latest/reference/services/ec2.html#EC2.ServiceResource.create_instances
worker_nodes:
    InstanceType: c4.4xlarge
    ImageId: ami-030c12d850a83dff1  # Deep Learning AMI (Ubuntu) Version 24.3

    # Run workers on spot by default. Comment this out to use on-demand.
    InstanceMarketOptions:
        MarketType: spot
        # Additional options can be found in the boto docs, e.g.
        #   SpotOptions:
        #       MaxPrice: MAX_HOURLY_PRICE

    # Additional options in the boto docs.

# List of commands that will be run before `setup_commands`. If docker is
# enabled, these commands will run outside the container and before docker
# is setup.
initialization_commands: []

# List of shell commands to run to set up nodes.
setup_commands:
    - pip install ray==0.8.2

# Custom commands that will be run on the head node after common setup.
head_setup_commands: []

# Custom commands that will be run on worker nodes after common setup.
worker_setup_commands: []

# Command to start ray on the head node. You don't need to change this.
head_start_ray_commands:
    - cd ~/cmbrl && git fetch && git checkout origin/ray_bug
    - cd ~/cmbrl/gym_minigrid && git pull && cd ..
    - ray stop
    - ulimit -n 65536; ray start --head --redis-port=6379 --object-manager-port=8076 --autoscaling-config=~/ray_bootstrap_config.yaml

# Command to start ray on worker nodes. You don't need to change this.
worker_start_ray_commands:
    - cd cmbrl && git fetch && git checkout origin/ray_bug
    - cd ~/cmbrl/gym_minigrid && git pull && cd ..
    - ray stop
    - ulimit -n 65536; ray start --address=$RAY_HEAD_IP:6379 --object-manager-port=8076
If we cannot run your script, we cannot fix your issue.
- I have verified my script runs in a clean environment and reproduces the issue.
- I have verified the issue also occurs with the latest wheels.
Top GitHub Comments
I have run into this issue also, when trying to do distributed rollouts using a previously trained agent. I believe the "traditional" way (in the docs etc.) to load and use a previously trained agent is to use the trainer object itself (in this case PPOTrainer) and call compute_action on it. It would seem natural to share it across Ray workers to do distributed rollouts, but PPOTrainer doesn't seem to be serializable. (You can work around this by loading the PPOTrainer locally in each worker, though it feels a little wrong, seems a bit slow, and generates a ton of logging...)
Here's a quick repro script (tested on the latest wheel for Python 3.6 on Linux as of 2020-06-23) that gives the above error when trying to serialize a PPOTrainer; you get the same AttributeError: 'PPO' object has no attribute 'workers'.
Should PPOTrainer be serializable?
It would be useful for distributed rollouts for some sort of post-training evaluation process, and also if you want to train a new agent in an environment that includes an agent trained earlier (e.g. imagine AlphaGo training against a static earlier version of itself).
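In the meantime, a minimal sketch of the "load the PPOTrainer locally in each worker" workaround described above (editor-added; checkpoint_path, the config handling, and the rollout loop are illustrative assumptions, not code from the issue):

    import ray
    from ray.rllib.agents.ppo.ppo import PPOTrainer


    @ray.remote
    def evaluate_checkpoint(config, checkpoint_path):
        # Ship only picklable arguments; rebuild and restore the trainer here.
        local_config = dict(config, num_workers=0)  # no nested rollout workers needed
        trainer = PPOTrainer(config=local_config)
        trainer.restore(checkpoint_path)

        env = TestEnv()  # same env class as in the repro script above
        obs, done, episode_reward = env.reset(), {'__all__': False}, 0.0
        while not done['__all__']:
            action = trainer.compute_action(obs['agent'], policy_id='agent')
            obs, reward, done, _ = env.step({'agent': action})
            episode_reward += reward['agent']
        return episode_reward


    # rewards = ray.get([evaluate_checkpoint.remote(config, checkpoint_path)
    #                    for _ in range(4)])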
@kanaadp based on your fix it looks like this isn't a bug in RLlib: you just need to make sure your closure does not capture the "info" dict. The info dict isn't serializable, so it raises errors when the closure is called remotely.
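A minimal sketch of that fix (editor-added, following the suggestion above): read the training iteration out of info before building the closure, so the function shipped to the remote workers only captures a picklable integer rather than the info dict and the trainer inside it.

    def on_train_result(info):
        trainer = info['trainer']
        iteration = info['result']['training_iteration']  # plain int, safe to pickle
        trainer.workers.foreach_worker(
            lambda ev: ev.foreach_env(
                lambda env: env.update_curriculum(iteration)))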