ray.remote decorators with tune.run for parallelization

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): MacOS 10.14.6
  • Ray installed from (source or binary): source
  • Ray version: ray 0.7.3
  • Python version: Python 3.6
  • Exact command to reproduce:

Describe the problem

I’m trying to use the @ray.remote() decorator to set the number of CPUs and GPUs for my objective function, while still customizing my schedulers and search algorithms through Tune. However, when I pass the decorated function to tune.run for parallel computation, it throws errors.

I’ve used the following code to specify resources, and it throws the following errors:

Source code

"""This test checks that HyperOpt is functional.

It also checks that it is usable with a separate scheduler.
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import ray
from ray.tune import run, grid_search
from ray.tune.schedulers import AsyncHyperBandScheduler
from ray.tune.suggest.hyperopt import HyperOptSearch

@ray.remote(num_cpus = 2)
def easy_objective(config, reporter):
    import time
    time.sleep(0.2)
    assert type(config["activation"]) == str, \
        "Config is incorrect: {}".format(type(config["activation"]))
    for i in range(config["iterations"]):
        reporter(
            timesteps_total=i,
            mean_loss=(config["height"] - 14)**2 - abs(config["width"] - 3))
        time.sleep(0.02)


if __name__ == "__main__":
    import argparse
    from hyperopt import hp

    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--smoke-test", action="store_true", help="Finish quickly for testing")
    args, _ = parser.parse_known_args()
    ray.init()

    space = {
        "width": hp.uniform("width", 0, 20),
        "height": hp.uniform("height", -100, 100),
        "activation": hp.choice("activation", ["relu", "tanh"])
    }

    current_best_params = [
        {
            "width": 1,
            "height": 2,
            "activation": 0  # Activation will be relu
        },
        {
            "width": 4,
            "height": 2,
            "activation": 1  # Activation will be tanh
        }
    ]

    config = {
        "num_samples": 10 if args.smoke_test else 1000,
        "config": {
            "iterations": 100,
        },
        "stop": {
            "timesteps_total": 100
        },
    }
    algo = HyperOptSearch(
        space,
        metric="mean_loss",
        mode="min",
        points_to_evaluate=current_best_params)
    scheduler = AsyncHyperBandScheduler(metric="mean_loss", mode="min")
    run(easy_objective.remote(), search_alg=algo, scheduler=scheduler, **config)

Logs

2019-08-29 13:44:58,066 INFO node.py:498 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-08-29_13-44-58_066395_102548/logs.
2019-08-29 13:44:58,170 INFO services.py:409 -- Waiting for redis server at 127.0.0.1:58008 to respond...
2019-08-29 13:44:58,280 INFO services.py:409 -- Waiting for redis server at 127.0.0.1:49449 to respond...
2019-08-29 13:44:58,282 INFO services.py:809 -- Starting Redis shard with 6.74 GB max memory.
2019-08-29 13:44:58,292 INFO node.py:512 -- Process STDOUT and STDERR is being redirected to /tmp/ray/session_2019-08-29_13-44-58_066395_102548/logs.
2019-08-29 13:44:58,293 INFO services.py:1475 -- Starting the Plasma object store with 10.11 GB memory using /dev/shm.
Traceback (most recent call last):
  File "example_ray.py", line 71, in <module>
    analysis = run(easy_objective.remote(), search_alg=algo, scheduler=scheduler, **config)
  File "/usr/local/lib/python3.6/dist-packages/ray/remote_function.py", line 78, in _remote_proxy
    return self._remote(args=args, kwargs=kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ray/remote_function.py", line 155, in _remote
    return invocation(args, kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ray/remote_function.py", line 134, in invocation
    kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ray/signature.py", line 215, in extend_args
    keyword_name, function_name))
Exception: No value was provided for the argument 'config' for the function 'easy_objective'.
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/apport_python_hook.py", line 63, in apport_excepthook
    from apport.fileutils import likely_packaged, get_recent_crashes
  File "/usr/lib/python3/dist-packages/apport/__init__.py", line 5, in <module>
    from apport.report import Report
  File "/usr/lib/python3/dist-packages/apport/report.py", line 30, in <module>
    import apport.fileutils
  File "/usr/lib/python3/dist-packages/apport/fileutils.py", line 23, in <module>
    from apport.packaging_impl import impl as packaging
  File "/usr/lib/python3/dist-packages/apport/packaging_impl.py", line 23, in <module>
    import apt
  File "/usr/lib/python3/dist-packages/apt/__init__.py", line 23, in <module>
    import apt_pkg
ModuleNotFoundError: No module named 'apt_pkg'

Original exception was:
Traceback (most recent call last):
  File "example_ray.py", line 71, in <module>
    analysis = run(easy_objective.remote(), search_alg=algo, scheduler=scheduler, **config)
  File "/usr/local/lib/python3.6/dist-packages/ray/remote_function.py", line 78, in _remote_proxy
    return self._remote(args=args, kwargs=kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ray/remote_function.py", line 155, in _remote
    return invocation(args, kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ray/remote_function.py", line 134, in invocation
    kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ray/signature.py", line 215, in extend_args
    keyword_name, function_name))
Exception: No value was provided for the argument 'config' for the function 'easy_objective'.
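
A note on reading the traceback: the exception is raised by `easy_objective.remote()` itself, before `run` is ever entered, because the call supplies neither `config` nor `reporter`. Tune appears to expect the function object and to supply those two arguments itself; per-trial CPU counts are normally requested through the `resources_per_trial` argument of `run` (for example `resources_per_trial={"cpu": 2}`) rather than through the decorator. The sketch below reproduces the same class of failure with the standard library only. `missing_argument_error` and `fake_run` are hypothetical helpers written for illustration; they are not Ray's implementation.

```python
import inspect


def easy_objective(config, reporter):
    """Same signature as the trainable in the issue; body shortened."""
    reporter(mean_loss=(config["height"] - 14) ** 2)


def missing_argument_error(fn, *args, **kwargs):
    """Return the binding error for fn(*args, **kwargs), or None if valid.

    Hypothetical helper: it only checks the arguments against the
    signature and never runs fn.
    """
    try:
        inspect.signature(fn).bind(*args, **kwargs)
    except TypeError as exc:
        return str(exc)
    return None


def fake_run(trainable, config):
    """Stand-in for a tuner (NOT Ray's code): it calls the trainable itself."""
    results = []
    trainable(config, lambda **metrics: results.append(metrics))
    return results


# Calling with no arguments, as easy_objective.remote() does, cannot bind.
error = missing_argument_error(easy_objective)
print(error)

# Handing over the function object lets the runner supply config and reporter.
results = fake_run(easy_objective, {"height": 10})
print(results)
```

The design point is that the runner, not the caller, decides when and with which arguments the objective is invoked, so the objective has to be passed uncalled.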

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

1 reaction
richardliaw commented, Aug 29, 2019

Ray doesn’t expose an explicit primitive for this, but you could do this yourself via pinning CPUs.

0 reactions
richardliaw commented, Aug 30, 2019

Why not just do:

def easy_objective(config, reporter):
    connection = MongoDBConnection()
    do_stuff()
    # ... rest of the objective ...

Alternatively, you can call a remote function inside the objective function:

@ray.remote
def outside_world_fn():
    return x  # x stands for whatever the task computes

def easy_objective(config, reporter):
    outside_world_fn.remote()
    # ... rest of the objective ...
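
To make the second suggestion concrete, here is a minimal sketch of the pattern: a plain objective function that launches side work asynchronously and waits for the result. It deliberately swaps Ray for the standard library's `concurrent.futures`, so `pool.submit(...)` plays the role of `outside_world_fn.remote(...)` and `future.result()` plays the role of `ray.get(...)`. The thread pool, the `width` argument, and the loss formula's reuse here are assumptions for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a Ray cluster: a small local thread pool (assumption).
pool = ThreadPoolExecutor(max_workers=2)


def outside_world_fn(width):
    """Hypothetical side computation; the @ray.remote task in the comment."""
    return abs(width - 3)


def easy_objective(config, reporter):
    # Analogue of outside_world_fn.remote(...): returns a future immediately.
    future = pool.submit(outside_world_fn, config["width"])
    # Analogue of ray.get(...): block until the value is available.
    penalty = future.result()
    for i in range(config["iterations"]):
        reporter(timesteps_total=i,
                 mean_loss=(config["height"] - 14) ** 2 - penalty)


reports = []
easy_objective({"width": 4, "height": 2, "iterations": 3},
               lambda **metrics: reports.append(metrics))
pool.shutdown()
print(reports[-1])
```

The objective itself stays an ordinary function, which is what lets a tuner call it with `config` and `reporter`; only the work inside it is dispatched asynchronously.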