[Feature] Rust API for Ray
Search before asking
- I had searched in the issues and found no similar feature requirement.
Description
Problem Description
Introduction
Ray currently allows for a very attractive distributed programming model that can be embedded within existing programming languages, offering low latency and fine-grained control of tasks and shared objects.
There are even optimized communication models, such as NCCL’s allreduce, although these are pretty task-specific.
The most prominent example of this embedding is Python, the dominant general purpose language for ML/data science. There are also embeddings for Java and C++ which are commonly used in enterprise systems.
A natural next step, and the subject of this proposal, is Rust, which is often touted as the successor to C++ in modern systems. Its popularity stems from its memory and thread safety, user-friendly zero-cost functional programming idioms, and ergonomic packaging and build system, all while retaining C/C++-like performance, avoiding GC unpredictability, and keeping a small memory footprint.
Many new projects in the data/distributed compute industry are building on Rust. Those that can run distributed over a cluster include InfluxDB-IOx (time series DB), Materialize (streaming SQL view maintenance, built on Timely/Differential Dataflow), Datafusion/Ballista/Arrow-rs (SQL query engine), Databend.rs (realtime analytics), and Fluvio (distributed streaming). Others include constellation-rs/the Hadean SDK (distributed compute platform), polar-rs (dataframe lib), and delta-rs (Apache Delta table interface).
We expect the number of such systems to grow going forward, possibly including next-gen distributed simulation/real-time engines (e.g. games, self-driving, traffic, large scale physics), distributed computing (graphs), databases and query engines, and other forms of distributed execution.
Exposing a Rust API would allow the growing Rust community to leverage Ray’s programming model and possibly drive improvements in the underlying Ray system.
Considerations
The Rust community may not like the thought of using a C++ library (being memory and thread unsafe) under the hood as opposed to a pure Rust library. But as these things go, the benefits may outweigh the reservations.
Alternative libraries for distributed computation also exist in Rust, such as timely-dataflow and constellation-rs. The former is dataflow-based with automatic pipelined communication focusing on data-parallel workloads, and the latter is process-based with explicit communication and (I believe) no built-in fault tolerance, with a spinoff library, amadeus, doing map-reduce-style data-parallel stream computation.
However, as is the case for many of Ray’s workloads, these styles of distributed computation may not suit the types of tasks being run, which may demand more fine-grained control, while programming with explicit communication may carry high cognitive overhead.
Requirements of a Worker
A worker must be able to:
- Talk to the runtime (raylet - Plasma object store and scheduler). This is largely handled by core_worker.
- Expose an API to the user embedded in the language (get, put, wait, remote) accepting that language’s native types (or objects), e.g. via generics.
- Provide the appropriate object reference semantics within the embedding language to allow for zero-copy reads.
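To make the second and third requirements concrete, here is a minimal sketch of what a typed get/put/wait surface could look like in Rust. Everything in it is an assumption for illustration: `Encode`, `ObjectRef`, and `LocalStore` are hypothetical names, the store is an in-memory map rather than Plasma, and `get` copies bytes instead of providing zero-copy reads.

```rust
use std::collections::HashMap;
use std::marker::PhantomData;

/// Hypothetical byte-level encoding trait; a real binding would more
/// likely build on serde or a similar serialization trait.
trait Encode: Sized {
    fn encode(&self) -> Vec<u8>;
    fn decode(bytes: &[u8]) -> Option<Self>;
}

impl Encode for u64 {
    fn encode(&self) -> Vec<u8> {
        self.to_le_bytes().to_vec()
    }
    fn decode(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != 8 {
            return None;
        }
        let mut arr = [0u8; 8];
        arr.copy_from_slice(bytes);
        Some(u64::from_le_bytes(arr))
    }
}

impl Encode for String {
    fn encode(&self) -> Vec<u8> {
        self.as_bytes().to_vec()
    }
    fn decode(bytes: &[u8]) -> Option<Self> {
        String::from_utf8(bytes.to_vec()).ok()
    }
}

/// Typed handle to a stored object. The type parameter exists only at
/// compile time, so `get` knows what to decode.
struct ObjectRef<T> {
    id: u64,
    _marker: PhantomData<T>,
}

/// In-memory stand-in for the object store (assumption: single process).
#[derive(Default)]
struct LocalStore {
    next_id: u64,
    objects: HashMap<u64, Vec<u8>>,
}

impl LocalStore {
    /// Encodes the value, stores the bytes, and returns a typed handle.
    fn put<T: Encode>(&mut self, value: &T) -> ObjectRef<T> {
        let id = self.next_id;
        self.next_id += 1;
        self.objects.insert(id, value.encode());
        ObjectRef { id, _marker: PhantomData }
    }

    /// Decodes the stored bytes back into `T`, if the object exists.
    fn get<T: Encode>(&self, r: &ObjectRef<T>) -> Option<T> {
        self.objects.get(&r.id).and_then(|b| T::decode(b))
    }

    /// Reports, per id, whether the object is available. This sketch
    /// never blocks, unlike a real `wait`.
    fn wait(&self, ids: &[u64]) -> Vec<bool> {
        ids.iter().map(|id| self.objects.contains_key(id)).collect()
    }
}

fn main() {
    let mut store = LocalStore::default();
    let a = store.put(&42u64);
    let b = store.put(&String::from("hello"));
    println!("{:?} {:?}", store.get(&a), store.get(&b));
    println!("{:?}", store.wait(&[a.id, b.id, 99]));
}
```

Carrying the value type on the handle means a mismatched `get` is rejected at compile time; whether the real API should do this is an open design choice.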
Objective
- The end goal is to produce something similar to rust-plasma, which provides a Rust interface to the C++ plasma object store.
The Current Structure of the C++ API
The C++ API exposes a minimal runtime interface (native and local mode):
- The ray runtime
- Object store
- Task submitter and store
Here is the main runtime API.
Local mode runs on a single node, in a single process and without RPC, mainly for testing. We will begin by developing the local mode API for approach validation and fast iteration.
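As a rough illustration of what "single process, no RPC" could mean on the Rust side, the sketch below runs each submitted task immediately in the caller and keeps the result in a map. `LocalRuntime`, `submit`, and `get` are hypothetical names chosen for this example and are not taken from the C++ API.

```rust
use std::any::Any;
use std::collections::HashMap;

/// Hypothetical local-mode runtime: tasks run in the calling process and
/// results stay in memory, so no serialization or RPC is involved.
#[derive(Default)]
struct LocalRuntime {
    next_id: u64,
    results: HashMap<u64, Box<dyn Any>>,
}

impl LocalRuntime {
    /// Runs the task right away and stores its result under a fresh id.
    fn submit<A, R: 'static>(&mut self, task: fn(A) -> R, arg: A) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.results.insert(id, Box::new(task(arg)));
        id
    }

    /// Borrows a stored result if the id exists and the type matches.
    fn get<R: 'static>(&self, id: u64) -> Option<&R> {
        self.results.get(&id).and_then(|b| b.downcast_ref::<R>())
    }
}

fn square(x: i64) -> i64 {
    x * x
}

fn main() {
    let mut rt = LocalRuntime::default();
    let id = rt.submit(square, 12);
    println!("{:?}", rt.get::<i64>(id));
}
```

Because results never leave the process, this kind of runtime is only useful for validating the API shape and for fast tests, which matches the role described above.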
It also exposes the include files which go beyond the basic Ray API, including:
- Serialization,
- Actor creation
Finally, the C++ API has the following utils:
- Process helper
- This is just about starting ray on a node and syncing up with the GCS
Approach
The approach is to use either the autocxx crate or the cxx crate directly to generate a set of workable bindings: to the C++ API if this is feasible, to the core_worker directly, or to a hybrid of both if deemed necessary, whichever is the happier path. Tests will be created on the Rust side to test out all of the functionality, including more expensive cluster mode integration tests.
Using these tests or otherwise, we will try to find and fix last mile issues, such as functionality that may not play well across language boundaries (e.g. reference counting).
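One way reference counting might be mapped onto Rust is to tie the count to a handle's lifetime through `Clone` and `Drop`. The sketch below assumes a hypothetical in-process `RefTable` in place of the runtime's real reference counter; only the ownership pattern is the point.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical stand-in for the runtime's per-object reference table.
type RefTable = Rc<RefCell<HashMap<u64, usize>>>;

/// Handle whose lifetime drives the count: +1 on create or clone,
/// -1 on drop.
struct ObjectRef {
    id: u64,
    table: RefTable,
}

impl ObjectRef {
    fn new(id: u64, table: &RefTable) -> Self {
        *table.borrow_mut().entry(id).or_insert(0) += 1;
        ObjectRef { id, table: Rc::clone(table) }
    }
}

impl Clone for ObjectRef {
    fn clone(&self) -> Self {
        ObjectRef::new(self.id, &self.table)
    }
}

impl Drop for ObjectRef {
    fn drop(&mut self) {
        let mut table = self.table.borrow_mut();
        let remaining = match table.get_mut(&self.id) {
            Some(count) => {
                *count -= 1;
                *count
            }
            None => return,
        };
        if remaining == 0 {
            // A real binding would notify the runtime here so the
            // object can be released.
            table.remove(&self.id);
        }
    }
}

/// Current count for an id, or 0 if no handle is alive.
fn count(table: &RefTable, id: u64) -> usize {
    table.borrow().get(&id).copied().unwrap_or(0)
}

fn main() {
    let table: RefTable = Rc::new(RefCell::new(HashMap::new()));
    let a = ObjectRef::new(1, &table);
    let b = a.clone();
    println!("after clone: {}", count(&table, 1));
    drop(a);
    drop(b);
    println!("after drops: {}", count(&table, 1));
}
```

The last mile issue hinted at above is that the C++ side keeps its own counts, so a real binding would need each `Clone` and `Drop` to cross the language boundary exactly once.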
We will use Rust’s procedural macros to instantiate tasks and actors, which can provide an API similarly pleasant to Python’s decorators. We may, in addition, provide idiomatic instantiations that add options as mutating methods on tasks, like those seen in the C++/Java APIs.
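The "options as methods" style can be sketched without any macro machinery. The example below is a hypothetical builder: `Task`, `TaskOptions`, and the option names `num_cpus`, `max_retries`, and `name` are illustrative assumptions, and `remote` simply calls the function in-process. A procedural macro could later generate this kind of wrapper from an annotated function.

```rust
/// Hypothetical per-task options, loosely modelled on resource hints.
#[derive(Debug, Clone, PartialEq)]
struct TaskOptions {
    num_cpus: u32,
    max_retries: u32,
    name: Option<String>,
}

/// Pairs a plain function with its options.
struct Task<A, R> {
    func: fn(A) -> R,
    options: TaskOptions,
}

impl<A, R> Task<A, R> {
    fn new(func: fn(A) -> R) -> Self {
        Task {
            func,
            options: TaskOptions { num_cpus: 1, max_retries: 0, name: None },
        }
    }

    /// Each setter consumes and returns the task so calls can be chained.
    fn num_cpus(mut self, n: u32) -> Self {
        self.options.num_cpus = n;
        self
    }

    fn max_retries(mut self, n: u32) -> Self {
        self.options.max_retries = n;
        self
    }

    fn name(mut self, name: &str) -> Self {
        self.options.name = Some(name.to_string());
        self
    }

    /// In this sketch `remote` runs the function locally; a real binding
    /// would hand the call and its options to the task submitter.
    fn remote(&self, arg: A) -> R {
        (self.func)(arg)
    }
}

fn add_one(x: i32) -> i32 {
    x + 1
}

fn main() {
    let task = Task::new(add_one).num_cpus(2).max_retries(3).name("inc");
    println!("{:?} -> {}", task.options, task.remote(41));
}
```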
Roadmap
- Implement local mode + tests (on the Rust side).
- Extend to cluster mode (native mode) + tests (on the Rust side). Attempt to PR incremental functionality in the following order:
- get/put. Replace msgpack with the serde or some equivalent serialization trait.
- task
- actor
- Flesh out details like config, error handling, async semantics
- proc macros: decorator-like task & actor instantiations
- Publish as a crate on crates.io
- The worker ought to be able to interface with an existing compatible ray runtime with no additional dependency
Test Cases
As a test case, I’d like to try implementing one option for distributed job scheduling for the Ballista distributed SQL engine (which differs from Spark SQL in having a native runtime with a smaller memory footprint). The current state of the job scheduling there is rather primitive. Possibly, Ray could help with query execution that exploits data locality rather than building such scheduling logic from scratch.
As a second test case, I would like to try to implement timely dataflow on top of Ray. Perhaps this could allow for streaming SQL queries on top of Ballista/DataFusion, although I worry about memory usage.
Future Directions
Cross-language:
- support for calling tasks
- cross-calling with primitive types
- cross-lang exception/call stack chaining
Async actors
- Can support single or multi-threaded event loops via Rust’s multiple options for async executors (tokio, async-std)
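The single-threaded event loop idea can be illustrated with the standard library alone. The sketch below swaps in an OS thread and `std::sync::mpsc` channels where an async executor such as tokio or async-std would sit; `Msg`, `CounterHandle`, and `spawn_counter` are hypothetical names, and nothing here touches Ray.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle};

/// Messages the hypothetical counter actor understands.
enum Msg {
    Add(u64),
    Get(Sender<u64>),
}

/// Handle held by callers; the state lives only on the actor's thread.
struct CounterHandle {
    tx: Sender<Msg>,
    join: JoinHandle<()>,
}

fn spawn_counter() -> CounterHandle {
    let (tx, rx) = channel::<Msg>();
    let join = thread::spawn(move || {
        let mut total = 0u64;
        // Messages are handled one at a time, in arrival order, so the
        // state needs no locking.
        for msg in rx {
            match msg {
                Msg::Add(n) => total += n,
                Msg::Get(reply) => {
                    let _ = reply.send(total);
                }
            }
        }
    });
    CounterHandle { tx, join }
}

impl CounterHandle {
    fn add(&self, n: u64) {
        self.tx.send(Msg::Add(n)).expect("actor stopped");
    }

    fn get(&self) -> u64 {
        let (reply_tx, reply_rx) = channel();
        self.tx.send(Msg::Get(reply_tx)).expect("actor stopped");
        reply_rx.recv().expect("actor stopped")
    }

    /// Closing the channel ends the actor's loop; then wait for it.
    fn stop(self) {
        drop(self.tx);
        self.join.join().expect("actor panicked");
    }
}

fn main() {
    let counter = spawn_counter();
    counter.add(2);
    counter.add(3);
    println!("total = {}", counter.get());
    counter.stop();
}
```

With an async executor the loop body would become an `async` task and `get` would await the reply instead of blocking, but the mailbox structure would stay the same.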
Multi-threaded actors/tasks
- Not sure if this is supported out of the box.
- Would be nice if task resource specifications could know about the requested resources in terms of threads.
User-specified compression scheme
- user might want to specify compression scheme, based on tradeoffs in time/space
- can preempt this initiative by making compression scheme generic (this is already a generic concept in Rust via Serde)
- concretely instantiate compression scheme in default_worker compilation as needed, with switch statement to toggle. Requirement for compression scheme to be global choice throughout cluster, although encoding this data in task_specification is not out of the question either.
- not sure how this plays with xlang requirements, in particular the directions for RaySerialization
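A generic compression scheme could be expressed as a trait that the worker is compiled against. The sketch below is one possible shape under stated assumptions: `Compression`, `NoCompression`, `RunLength`, `store_bytes`, and `load_bytes` are hypothetical names, and the run-length codec is a toy chosen only because it fits in a few lines.

```rust
/// Hypothetical compression scheme selected at compile time.
trait Compression {
    fn compress(input: &[u8]) -> Vec<u8>;
    fn decompress(input: &[u8]) -> Vec<u8>;
}

/// Pass-through scheme.
struct NoCompression;

impl Compression for NoCompression {
    fn compress(input: &[u8]) -> Vec<u8> {
        input.to_vec()
    }
    fn decompress(input: &[u8]) -> Vec<u8> {
        input.to_vec()
    }
}

/// Toy run-length encoding: (count, byte) pairs, runs capped at 255.
struct RunLength;

impl Compression for RunLength {
    fn compress(input: &[u8]) -> Vec<u8> {
        let mut out = Vec::new();
        let mut i = 0;
        while i < input.len() {
            let byte = input[i];
            let mut run = 1usize;
            while i + run < input.len() && input[i + run] == byte && run < 255 {
                run += 1;
            }
            out.push(run as u8);
            out.push(byte);
            i += run;
        }
        out
    }

    fn decompress(input: &[u8]) -> Vec<u8> {
        let mut out = Vec::new();
        for pair in input.chunks(2) {
            // Ignore a trailing half pair rather than panic.
            if pair.len() == 2 {
                out.extend(std::iter::repeat(pair[1]).take(pair[0] as usize));
            }
        }
        out
    }
}

/// The scheme is a type parameter, so one choice can be fixed for a
/// whole worker build.
fn store_bytes<C: Compression>(payload: &[u8]) -> Vec<u8> {
    C::compress(payload)
}

fn load_bytes<C: Compression>(stored: &[u8]) -> Vec<u8> {
    C::decompress(stored)
}

fn main() {
    let data = vec![7u8; 300];
    let packed = store_bytes::<RunLength>(&data);
    println!("{} bytes -> {} bytes", data.len(), packed.len());
    println!("round trip ok: {}", load_bytes::<RunLength>(&packed) == data);
}
```

A type parameter fixes the scheme per build, which fits the cluster-wide choice described above; carrying the choice in the task specification would instead call for a runtime tag and a match over schemes.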
Direct buffer writing
- one can define traits to work with preallocated objects, for types that have fixed size or sizes known in advance.
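As a sketch of such a trait, the example below lets a type declare its encoded size as an associated constant and write itself into a caller-provided slice. `FixedSizeEncode`, `Point`, and `write_all` are hypothetical names, and the buffer is an ordinary `Vec<u8>` standing in for a preallocated object-store buffer.

```rust
/// Hypothetical trait for values whose encoded size is known up front.
trait FixedSizeEncode: Sized {
    const SIZE: usize;
    /// Writes exactly `SIZE` bytes into the start of `buf`.
    fn write_into(&self, buf: &mut [u8]);
    /// Reads a value back from the first `SIZE` bytes of `buf`.
    fn read_from(buf: &[u8]) -> Self;
}

#[derive(Debug, PartialEq, Clone, Copy)]
struct Point {
    x: f32,
    y: f32,
}

impl FixedSizeEncode for Point {
    const SIZE: usize = 8;

    fn write_into(&self, buf: &mut [u8]) {
        buf[..4].copy_from_slice(&self.x.to_le_bytes());
        buf[4..8].copy_from_slice(&self.y.to_le_bytes());
    }

    fn read_from(buf: &[u8]) -> Self {
        let mut x = [0u8; 4];
        let mut y = [0u8; 4];
        x.copy_from_slice(&buf[..4]);
        y.copy_from_slice(&buf[4..8]);
        Point { x: f32::from_le_bytes(x), y: f32::from_le_bytes(y) }
    }
}

/// Writes every item into one preallocated buffer, with no per-item
/// allocation or intermediate `Vec`.
fn write_all<T: FixedSizeEncode>(items: &[T], buf: &mut [u8]) {
    assert!(buf.len() >= items.len() * T::SIZE, "buffer too small");
    for (item, slot) in items.iter().zip(buf.chunks_mut(T::SIZE)) {
        item.write_into(slot);
    }
}

fn main() {
    let points = [Point { x: 1.0, y: 2.0 }, Point { x: -3.5, y: 0.25 }];
    // Stand-in for a buffer handed out by the object store.
    let mut buf = vec![0u8; points.len() * Point::SIZE];
    write_all(&points, &mut buf);
    println!("{:?}", Point::read_from(&buf[8..16]));
}
```

Because `SIZE` is known at compile time, the total allocation can be requested from the store before any value is serialized.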
Use case
No response
Related issues
No response
Are you willing to submit a PR?
- Yes I am willing to submit a PR!
Issue Analytics
- State:
- Created 2 years ago
- Reactions:35
- Comments:11 (6 by maintainers)
Glad to see this proposal! I’m a newbie to Rust. But based on my experience with the Java and C++ APIs, here are some thoughts that might be useful.
Hi @jon-chuang, great to see this effort! I think the community will be interested in the Ray Rust API proposal, and pros and cons for building on top of Ray C++ API vs core worker API.
Implementing the Ray Java and C++ APIs has been a large undertaking, so please don’t shy away from reaching out to the Ray team. The Ray team is active in the project Slack channel to discuss Ray internals, if you have not joined already!