[job submission] Support running supervisor actors in worker nodes

Search before asking

  • I searched the issues and found no similar feature request.

Description

In some real-world use cases, we deploy the dashboard on a separate node that doesn't run a Raylet process. We do this to make the dashboard highly available. However, job submission currently calls ray.init in the dashboard to ensure the supervisor actor can be launched. This assumes that the dashboard and the Raylet are colocated on one node.

So, can we use Ray Client in dashboard?
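To make the question concrete, here is a minimal sketch of the two connection modes being compared. The helper names `choose_ray_address` and `connect` and the `has_local_raylet` flag are hypothetical and only for illustration; they are not part of Ray or the dashboard code. The sketch assumes the Ray Client server listens on its default port, 10001.

```python
from typing import Optional

# Default port of the Ray Client server (assumed unchanged in this sketch).
RAY_CLIENT_DEFAULT_PORT = 10001


def choose_ray_address(
    has_local_raylet: bool,
    head_host: Optional[str] = None,
    client_port: int = RAY_CLIENT_DEFAULT_PORT,
) -> str:
    """Hypothetical helper: pick the address string to pass to ray.init."""
    if has_local_raylet:
        # Colocated case: attach to the cluster through the local node.
        return "auto"
    if not head_host:
        raise ValueError("head_host is required when there is no local raylet")
    # Separate-node case: go through Ray Client instead.
    return f"ray://{head_host}:{client_port}"


def connect(address: str):
    """Connect to the cluster; requires Ray to be installed."""
    import ray  # imported lazily so the helper above works without Ray

    return ray.init(address=address)
```

With this shape, a dashboard on a node without a Raylet would call something like `connect(choose_ray_address(False, "head-node"))`, while a colocated dashboard would keep using `"auto"`.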

Use case

No response

Related issues

No response

Are you willing to submit a PR?

  • Yes, I am willing to submit a PR!

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 12 (12 by maintainers)

Top GitHub Comments

1 reaction
SongGuyang commented, May 13, 2022

It seems we still have more than half a month before June. I will lead a discussion within Ant next week first and then come back to sync with you.

1 reaction
edoakes commented, Mar 22, 2022

I think this is backwards: the Ray Client server uses a hacky implementation to create the driver process, one that circumvents the standard process scheduling and runtime_env ref counting path. Instead, it should do the same thing the job submission server does: schedule a regular actor. This would unify how we do process management and environment setup across the board.

To solve the issue of calling ray.init against the raylet, maybe we should have the actor calls happen in the per-node agent instead of the main dashboard process?
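As a rough illustration of "schedule a regular actor", the sketch below wraps a plain Python class as a Ray actor that runs a job entrypoint in a child process. The `JobSupervisor` class and `start_supervisor` function here are hypothetical stand-ins written for this example, not Ray's actual job submission implementation. It assumes the calling process (for example the per-node agent, which sits next to a raylet) has already called `ray.init`.

```python
import subprocess
from typing import Tuple


class JobSupervisor:
    """Hypothetical supervisor: runs one job entrypoint as a child process."""

    def __init__(self, entrypoint: str):
        self.entrypoint = entrypoint

    def run(self) -> Tuple[int, str]:
        # Run the entrypoint through the shell and capture its stdout.
        proc = subprocess.run(
            self.entrypoint, shell=True, capture_output=True, text=True
        )
        return proc.returncode, proc.stdout


def start_supervisor(entrypoint: str):
    """Schedule the supervisor as a regular Ray actor.

    Assumes ray.init(...) was already called in this process.
    """
    import ray  # imported lazily so JobSupervisor is usable without Ray

    # num_cpus=0 so the supervisor itself does not reserve a CPU slot.
    supervisor_cls = ray.remote(num_cpus=0)(JobSupervisor)
    return supervisor_cls.remote(entrypoint)
```

A caller would then use `handle = start_supervisor("python my_script.py")` followed by `ray.get(handle.run.remote())`; because the actor goes through normal scheduling, any runtime environment passed via actor options would be handled by the standard path rather than a special case.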
