[Feature] Rust API for Ray

Search before asking

I have searched the issues and found no similar feature request.

Description

Problem Description

Introduction

Ray currently allows for a very attractive distributed programming model that can be embedded within existing programming languages, offering low latency and fine-grained control of tasks and shared objects.

There are even optimized communication models, such as NCCL’s allreduce, although these are pretty task-specific.

The most prominent example of this embedding is Python, the dominant general purpose language for ML/data science. There are also embeddings for Java and C++ which are commonly used in enterprise systems.

A natural next step, and the subject of this proposal, is Rust, which is often touted as the successor to C++ in modern systems. Its popularity is due to its memory and thread safety, user-friendly zero-cost functional programming idioms, and ergonomic packaging and build system, all while retaining C/C++-like performance, avoiding GC unpredictability, and keeping a small memory footprint.

Many new projects in the data and distributed compute industry are building on Rust. Several of them can run distributed over a cluster: InfluxDB-IOx (time series DB), Materialize (streaming SQL view maintenance, built on Timely/Differential Dataflow), DataFusion/Ballista/Arrow-rs (SQL query engine), Databend.rs (realtime analytics), and Fluvio (distributed streaming). Others include constellation-rs/the Hadean SDK (distributed compute platform), Polars (dataframe library), and delta-rs (Apache Delta table interface).

We expect the number of such systems to grow going forward, possibly including next-gen distributed simulation/real-time engines (e.g. games, self-driving, traffic, large scale physics), distributed computing (graphs), databases and query engines, and other forms of distributed execution.

Exposing a Rust API would allow the growing Rust community to leverage Ray’s programming model and possibly drive improvements in the underlying Ray system.

Considerations

The Rust community may not like the thought of using a C++ library (which is memory- and thread-unsafe) under the hood as opposed to a pure Rust library. But as these things go, the benefits may outweigh the reservations.

Alternative libraries for distributed computation also exist in Rust, such as timely-dataflow and constellation-rs. The former is dataflow-based with automatic pipelined communication focusing on data-parallel workloads, and the latter is process-based with explicit communication and (I believe) no built-in fault tolerance, with a spinoff library amadeus doing map reduce style data-parallel stream computation.

However, as is the case for many of Ray’s workloads, the dataflow style of distributed computation may not suit tasks that demand more fine-grained control, while programming with explicit communication may carry high cognitive overhead.

Requirements of a Worker

A worker must be able to:

Talk to the runtime (raylet - Plasma object store and scheduler). This is largely handled by core_worker.
Expose an API to the user embedded in the language (get, put, wait, remote) accepting that language’s native types (or objects), e.g. via generics.
Provide the appropriate object reference semantics within the embedding language to allow for zero-copy reads.
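
To make these requirements concrete, here is a minimal local-mode sketch in plain Rust using only the standard library. `LocalStore`, `ObjectRef`, and their methods are hypothetical names chosen for illustration; they are not part of Ray or of any existing crate. `get` hands back an `Arc<T>` that points at the stored value, which stands in for zero-copy reads, and `wait` is omitted because this sketch runs tasks eagerly.

```rust
use std::any::Any;
use std::collections::HashMap;
use std::marker::PhantomData;
use std::sync::Arc;

/// Hypothetical typed reference to an object in the store (not a Ray type).
pub struct ObjectRef<T> {
    id: u64,
    _marker: PhantomData<fn() -> T>,
}

/// Hypothetical in-process object store standing in for Plasma in local mode.
pub struct LocalStore {
    next_id: u64,
    objects: HashMap<u64, Arc<dyn Any + Send + Sync>>,
}

impl LocalStore {
    pub fn new() -> Self {
        LocalStore { next_id: 0, objects: HashMap::new() }
    }

    /// Store a native Rust value and return a typed reference to it.
    pub fn put<T: Any + Send + Sync>(&mut self, value: T) -> ObjectRef<T> {
        let id = self.next_id;
        self.next_id += 1;
        self.objects.insert(id, Arc::new(value));
        ObjectRef { id, _marker: PhantomData }
    }

    /// Return a shared pointer to the stored value without copying it.
    pub fn get<T: Any + Send + Sync>(&self, r: &ObjectRef<T>) -> Option<Arc<T>> {
        let obj = self.objects.get(&r.id)?.clone();
        obj.downcast::<T>().ok()
    }

    /// Run `f` on a stored argument right away and store its result.
    pub fn remote<A, R, F>(&mut self, f: F, arg: &ObjectRef<A>) -> Option<ObjectRef<R>>
    where
        A: Any + Send + Sync,
        R: Any + Send + Sync,
        F: FnOnce(&A) -> R,
    {
        let input = self.get(arg)?;
        let output = f(&*input);
        Some(self.put(output))
    }
}
```

Two calls to `get` on the same reference return pointers to the same allocation, which is the property a real binding would need to preserve when reading from shared memory.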

Objective

The end goal is to produce something similar to rust-plasma, which provides a Rust interface to the C++ plasma object store.

The Current Structure of the C++ API

The C++ API exposes a minimal runtime interface (native and local mode):

The ray runtime
Object store
Task submitter and store

Here is the main runtime API.

Local mode runs on a single node, in a single process and without RPC, and is mainly used for testing. We will begin by developing the local mode API for approach validation and fast iteration.

It also exposes the include files which go beyond the basic Ray API, including:

Serialization
Actor creation

Finally, the C++ API has the following utils:

Process helper
- This is just about starting ray on a node and syncing up with the GCS

Approach

The approach is to use either the autocxx crate or the cxx crate directly to generate a set of workable bindings. These could target the C++ API directly if this is feasible, the core_worker directly, or a hybrid of both if deemed necessary, whichever is the happier path. Tests will be written on the Rust side to exercise all of the functionality, including more expensive cluster mode integration tests.

Using these tests or otherwise, we will try to find and fix last mile issues, such as functionality that may not play well across language boundaries (e.g. reference counting).
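
As an illustration of the reference counting concern, one option is to tie the count to Rust ownership, so that cloning a handle adds a reference and dropping it removes one. The sketch below is an assumption about how this could look, not how Ray does it: `RefTable` is a hypothetical stand-in for whatever reference table the runtime keeps, and `ObjectHandle` is a made-up wrapper type.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the runtime's per-object local reference table.
#[derive(Default)]
pub struct RefTable {
    counts: Mutex<HashMap<u64, usize>>,
}

impl RefTable {
    fn add(&self, id: u64) {
        *self.counts.lock().unwrap().entry(id).or_insert(0) += 1;
    }

    fn remove(&self, id: u64) {
        let mut counts = self.counts.lock().unwrap();
        let remaining = match counts.get_mut(&id) {
            Some(n) => {
                *n -= 1;
                *n
            }
            None => return,
        };
        // Forget the object once nothing refers to it any more.
        if remaining == 0 {
            counts.remove(&id);
        }
    }

    /// Number of live handles for `id` (0 if the object is unknown).
    pub fn count(&self, id: u64) -> usize {
        self.counts.lock().unwrap().get(&id).copied().unwrap_or(0)
    }
}

/// Hypothetical handle whose lifetime drives the reference count.
pub struct ObjectHandle {
    id: u64,
    table: Arc<RefTable>,
}

impl ObjectHandle {
    pub fn new(id: u64, table: Arc<RefTable>) -> Self {
        table.add(id);
        ObjectHandle { id, table }
    }
}

impl Clone for ObjectHandle {
    // Cloning a handle registers one more reference.
    fn clone(&self) -> Self {
        ObjectHandle::new(self.id, Arc::clone(&self.table))
    }
}

impl Drop for ObjectHandle {
    // Dropping a handle releases its reference.
    fn drop(&mut self) {
        self.table.remove(self.id);
    }
}
```

In a real binding the `add` and `remove` bodies would presumably forward across the FFI boundary, which is exactly where the last mile issues mentioned above are likely to surface.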

We will use Rust’s procedural macros to instantiate tasks and actors, which can provide a similarly pleasant API to Python’s decorators. We may, in addition, provide idiomatic instantiations that add options as mutating methods on tasks, like those seen in the C++/Java APIs.
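
A rough sketch of what such a declaration could feel like is below. Procedural macros have to live in their own crate, so this uses a declarative `macro_rules!` macro as a stand-in. `remote_task!`, `Task`, and the option names `num_cpus` and `max_retries` are all hypothetical, and `remote` simply runs the function inline, as a local mode might.

```rust
/// Hypothetical task description: a plain function plus submission options.
pub struct Task<A, R> {
    pub name: &'static str,
    pub func: fn(A) -> R,
    pub num_cpus: u32,
    pub max_retries: u32,
}

impl<A, R> Task<A, R> {
    /// Option setters in the style of chained methods.
    pub fn with_num_cpus(mut self, n: u32) -> Self {
        self.num_cpus = n;
        self
    }

    pub fn with_max_retries(mut self, n: u32) -> Self {
        self.max_retries = n;
        self
    }

    /// In this sketch the task runs inline instead of being submitted.
    pub fn remote(&self, arg: A) -> R {
        (self.func)(arg)
    }
}

/// Stand-in for a decorator-like procedural macro: wraps a one-argument
/// function and generates a constructor returning a `Task`.
macro_rules! remote_task {
    (fn $name:ident($arg:ident : $arg_ty:ty) -> $ret:ty $body:block) => {
        fn $name() -> Task<$arg_ty, $ret> {
            fn inner($arg: $arg_ty) -> $ret $body
            Task {
                name: stringify!($name),
                func: inner,
                num_cpus: 1,
                max_retries: 0,
            }
        }
    };
}

remote_task! {
    fn square(x: i64) -> i64 {
        x * x
    }
}
```

With this in place, `square().with_num_cpus(2).remote(7)` configures the task and then invokes it. An attribute-style procedural macro could offer the same shape with less ceremony at the declaration site.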

Roadmap

Implement local mode + tests (on the Rust side).
Extend to cluster mode (native mode) + tests (on the Rust side). Attempt to PR incremental functionality in the following order:
1. get/put. Replace msgpack with serde or some equivalent serialization trait.
2. task
3. actor
Flesh out details like config, error handling, async semantics
proc macros: decorator-like task & actor instantiations
Publish as a crate on crates.io
- The worker ought to be able to interface with an existing compatible ray runtime with no additional dependency

Test Cases

As a test case, I’d like to try implementing one option for distributed job scheduling for the Ballista distributed SQL engine (which differs from Spark SQL in having a native runtime with a smaller memory footprint). The current state of the job scheduling there is rather primitive. Possibly, Ray could help with query execution that exploits data locality rather than building such scheduling logic from scratch.

As a second test case, I would like to try to implement timely dataflow on top of Ray. Perhaps this could allow for streaming SQL queries on top of Ballista/DataFusion, although I worry about memory usage.

Future Directions

Cross-language:

support for calling tasks
cross-calling with primitive types
cross-lang exception/call stack chaining

Async actors

Can support single or multi-threaded event loops via Rust’s multiple options for async executors (tokio, async-std)

Multi-threaded actors/tasks

Not sure if this is supported out of the box.
Would be nice if task resource specifications could know about the requested resources in terms of threads.

User-specified compression scheme

user might want to specify compression scheme, based on tradeoffs in time/space
can preempt this initiative by making compression scheme generic (this is already a generic concept in Rust via Serde)
concretely instantiate compression scheme in default_worker compilation as needed, with switch statement to toggle. Requirement for compression scheme to be global choice throughout cluster, although encoding this data in task_specification is not out of the question either.
not sure how this plays with xlang requirements, in particular the directions for RaySerialization
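
One way to read the "generic compression scheme" idea is as a trait that the put/get path is parameterised over, so the concrete scheme is chosen at compile time. The sketch below assumes exactly that. `Compression`, `Identity`, `RunLength`, `put_bytes`, and `get_bytes` are hypothetical names, and the run-length codec is only a toy stand-in for a real compressor.

```rust
/// Hypothetical compression interface chosen at compile time.
pub trait Compression {
    fn compress(input: &[u8]) -> Vec<u8>;
    fn decompress(input: &[u8]) -> Vec<u8>;
}

/// No compression: bytes pass through unchanged.
pub struct Identity;

impl Compression for Identity {
    fn compress(input: &[u8]) -> Vec<u8> {
        input.to_vec()
    }

    fn decompress(input: &[u8]) -> Vec<u8> {
        input.to_vec()
    }
}

/// Toy run-length encoding: each run becomes a (length, byte) pair.
pub struct RunLength;

impl Compression for RunLength {
    fn compress(input: &[u8]) -> Vec<u8> {
        let mut out = Vec::new();
        let mut iter = input.iter().copied().peekable();
        while let Some(byte) = iter.next() {
            let mut run: u8 = 1;
            // Extend the run while the next byte matches, up to 255.
            while run < u8::MAX && iter.peek() == Some(&byte) {
                iter.next();
                run += 1;
            }
            out.push(run);
            out.push(byte);
        }
        out
    }

    fn decompress(input: &[u8]) -> Vec<u8> {
        let mut out = Vec::new();
        for pair in input.chunks_exact(2) {
            out.extend(std::iter::repeat(pair[1]).take(pair[0] as usize));
        }
        out
    }
}

/// Serialized payloads go through the chosen scheme on the way in...
pub fn put_bytes<C: Compression>(payload: &[u8]) -> Vec<u8> {
    C::compress(payload)
}

/// ...and through its inverse on the way out.
pub fn get_bytes<C: Compression>(stored: &[u8]) -> Vec<u8> {
    C::decompress(stored)
}
```

Because the type parameter is resolved at compile time, both sides of a transfer must agree on the scheme, which matches the point above about it being a cluster-wide choice unless it is also recorded in the task specification.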

Direct buffer writing

one can define traits to work with preallocated objects, for types that have fixed size or sizes known in advance.
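
A possible shape for such a trait is sketched below, assuming types whose encoded size is a compile-time constant. `FixedSizeEncode`, `Point`, and `write_all` are hypothetical names, and the `Vec<u8>` in the usage note stands in for a buffer preallocated in the object store.

```rust
/// Hypothetical trait for types with a fixed encoded size.
pub trait FixedSizeEncode: Sized {
    /// Number of bytes every value of this type occupies (must be non-zero).
    const SIZE: usize;

    /// Write the value into `buf`, which must hold at least `SIZE` bytes.
    fn write_into(&self, buf: &mut [u8]);

    /// Read a value back from the first `SIZE` bytes of `buf`.
    fn read_from(buf: &[u8]) -> Self;
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Point {
    pub x: f64,
    pub y: f64,
}

impl FixedSizeEncode for Point {
    const SIZE: usize = 16;

    fn write_into(&self, buf: &mut [u8]) {
        buf[..8].copy_from_slice(&self.x.to_le_bytes());
        buf[8..16].copy_from_slice(&self.y.to_le_bytes());
    }

    fn read_from(buf: &[u8]) -> Self {
        let mut x = [0u8; 8];
        let mut y = [0u8; 8];
        x.copy_from_slice(&buf[..8]);
        y.copy_from_slice(&buf[8..16]);
        Point { x: f64::from_le_bytes(x), y: f64::from_le_bytes(y) }
    }
}

/// Write all items back to back into a preallocated buffer and return
/// the number of bytes used. No intermediate allocation is made.
pub fn write_all<T: FixedSizeEncode>(items: &[T], buf: &mut [u8]) -> usize {
    let needed = items.len() * T::SIZE;
    assert!(buf.len() >= needed, "buffer too small");
    for (item, slot) in items.iter().zip(buf.chunks_exact_mut(T::SIZE)) {
        item.write_into(slot);
    }
    needed
}
```

For example, calling `write_all(&points, &mut buf)` with two points and a 64-byte `buf` fills the first 32 bytes, and `Point::read_from(&buf[16..32])` recovers the second point.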

Use case

No response

Related issues

No response

Are you willing to submit a PR?

Yes I am willing to submit a PR!

Issue Analytics

Created: 2 years ago
Reactions: 35
Comments: 11 (6 by maintainers)

Top GitHub Comments

5 reactions

raulchen commented, Nov 22, 2021

Glad to see this proposal! I’m a newbie to Rust. But based on my experiences from Java and C++ API, here are some thoughts that might be useful.

Local mode vs cluster mode: local mode in other languages is still incomplete. That is partially because some APIs are semantically incompatible with local mode (kill actor for example), partially because of the extra work. So in our practice, users don’t usually use local mode. In addition, setting up a single-node Ray cluster is much easier now than before. So I’d say the necessity of local mode is not that big. You may want to consider going with the cluster mode directly.
Built on top of the core worker vs the C++ runtime: supposedly, a new language binding should be built on top of the core worker, and the C++ runtime should only contain C++-specific code. I’m not sure how much of the C++ runtime can be re-used for Rust, except for the process helper. It’s okay for the quick prototype. But if we find something in the C++ runtime that can be re-used for other languages, we’d better eventually move it to the core worker.

4 reactions

mwtian commented, Nov 22, 2021

Hi @jon-chuang, great to see this effort! I think the community will be interested in the Ray Rust API proposal, and pros and cons for building on top of Ray C++ API vs core worker API.

Implementing the Ray Java and C++ APIs has been a large undertaking, so please don’t shy away from reaching out to the Ray team. The Ray team is active in the project Slack channel to discuss Ray internals, if you have not joined already!
