[job submission] Support running supervisor actors in worker nodes

Search before asking

I searched the issues and found no similar feature request.

Description

In some real-world use cases, we deploy the dashboard on a separate node that does not run a Raylet process. We do this to make the dashboard highly available. However, job submission currently calls ray.init in the dashboard so that the supervisor actor can be launched. This assumes that the dashboard and a Raylet are colocated on the same node.

So, can we use Ray Client in the dashboard instead?
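
For readers unfamiliar with the two connection modes being contrasted, here is a minimal sketch. It is not the dashboard's actual code: `choose_ray_address` and `connect` are hypothetical helpers, the host name is a placeholder, and 10001 is assumed to be the Ray Client server port (its default).

```python
from typing import Optional


def choose_ray_address(has_local_raylet: bool,
                       head_host: Optional[str] = None,
                       client_port: int = 10001) -> str:
    """Pick the address to pass to ray.init (hypothetical helper).

    With a local Raylet, "auto" attaches to the running cluster as a
    regular driver. Without one, a "ray://" address goes through Ray
    Client, which proxies calls to a server inside the cluster.
    """
    if has_local_raylet:
        return "auto"
    if not head_host:
        raise ValueError("head_host is required when no local Raylet exists")
    return f"ray://{head_host}:{client_port}"


def connect(has_local_raylet: bool, head_host: Optional[str] = None):
    # Imported lazily so the address logic works without Ray installed.
    import ray

    return ray.init(address=choose_ray_address(has_local_raylet, head_host))
```

A dashboard on a node without a Raylet would end up on the `ray://` branch, which is the option this issue asks about.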

Use case

No response

Related issues

No response

Are you willing to submit a PR?

Yes I am willing to submit a PR!

Top GitHub Comments

1 reaction

SongGuyang commented, May 13, 2022

It seems we still have more than half a month before June. I will lead a discussion at Ant next week first and then get back to sync with you.

1 reaction

edoakes commented, Mar 22, 2022

I think this is backwards: the ray client server uses a hacky implementation to create the driver process that circumvents the standard process scheduling & runtime_env ref counting path. Instead it should do the same thing that the job submission server does: schedule a regular actor. This would unify how we do process management and environment setup across the board.

To solve the issue of calling ray.init against the raylet, maybe we should have the actor calls happen in the per-node agent instead of the main dashboard process?
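
The per-node agent idea can be pictured with a small, Ray-free sketch. Everything here is hypothetical and only illustrates the control flow: `NodeAgent` and `DashboardHead` are stand-in classes, and the least-loaded selection policy is an assumption, not something proposed in the thread.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NodeAgent:
    """Stand-in for a per-node agent, which always runs next to a Raylet."""
    node_id: str
    supervisors: List[str] = field(default_factory=list)

    def start_supervisor(self, job_id: str) -> str:
        # In a real cluster the supervisor actor would be created here,
        # because this process can reach the local Raylet.
        self.supervisors.append(job_id)
        return f"{job_id}@{self.node_id}"


@dataclass
class DashboardHead:
    """Stand-in for the main dashboard process; holds no Ray connection."""
    agents: Dict[str, NodeAgent]

    def submit_job(self, job_id: str) -> str:
        if not self.agents:
            raise RuntimeError("no node agent available")
        # Forward to the agent with the fewest supervisors (assumed policy).
        agent = min(self.agents.values(), key=lambda a: len(a.supervisors))
        return agent.start_supervisor(job_id)
```

In this arrangement the dashboard process only routes requests, so it can live on a node without a Raylet while the agents do the work that needs one.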
