Stuck on an issue?

Lightrun Answers was designed to reduce the constant googling that comes with debugging 3rd party libraries. It collects links to all the places you might be looking at while hunting down a tough bug.

And, if you’re still stuck at the end, we’re happy to hop on a call to see how we can help out.

Discuss the value of `lift` in the 6.0 world

See original GitHub issue

I’d like to talk about removing lift and the ceremony around that.

Why we had `lift`

lift is a really cool architecture for a couple of reasons:

It allowed us to control the return type of operators for better dot-chaining. This meant that users could create custom operators on a class that extends Observable, then override lift to return the custom class, allowing the custom operators to mix in the same dot-chains as standard operators.
It gave us a centralized point that we could hook to affect (almost) all operators. This was going to enable us to do fancy reporting or build tooling around RxJS Observables.

Lift In The World Of Pipeable Operators

With Pipeable operators, we’re no longer dot-chaining everything. This means that controlling the return type no longer matters, and developing a custom operator is as simple as writing a higher-order function. This calls into question the value of lift, since that eliminates benefit number 1 listed above.

Given that all operators will likely go through the pipe method, or otherwise carry a similar signature. It seems like there would be ample ways to hook our operators for any sort of tooling we might want to do. Either way, the promise of lift as a centralized point to hook everything never came through, and in fact there were always some operators that just didn’t use lift so it wouldn’t have been perfect.

Proposed change for 6.0

Remove lift process entirely.

I think the benefit here will be a sharp reduction in the size of the library. As there will no longer be a need for all of the Operator classes in the library, nor will there be calls to lift or implementations of lift on Observable or Subject.

Interested in all of your thoughts… especially @trxcllnt

Issue Analytics

State:
Created 6 years ago
Comments:16 (13 by maintainers)

Top GitHub Comments

4reactions

trxcllntcommented, Nov 15, 2017

The Operator classes were an optimization tailored for what JIT’d well in v8 at the time (more details in the original Lifted Observable proposal, and below). If operator functions JIT (or tree-shake, etc.) better now, then we should do that.

In fact, the Operator interface was explicitly designed to mimic the runtime signature of regular functions, but for perf we implemented all of them as classes:

interface Operator<T, R> {
  call(subscriber: Subscriber<R>, source: any): TeardownLogic;
}

// these are both valid:
class OperatorA() {
  call(sink, source) { return source.subscribe(sink) }
}
source.lift(new OperatorA());

function OperatorB(source) {
  return source.subscribe(this);
}
source.lift(OperatorB);

I’ve even proposed updating the Operator interface to reflect this (at the time the interface was written, TS didn’t support hybrid types), but got push back that it could be confusing.

What’s the deal with `lift`?

@felixfbecker @staltz can’t any operator that can be implemented with (B) be imlpemented with
export const myOperator = () => (source: Observable<T>) => new Observable<R>(observer => {
  // operator code
})
?

Totally! That’s how Rx < 5 was implemented (until ~1.5 years ago or so, when Matt switched it to Observable/Observer pairs). The issue is that (at the time) JITs weren’t great at inlining anonymous function invocations, so no matter how much we optimized things like inner disposables, there was always a performance barrier for throughput.

JITs did perform well on the linked-list-of-subscribers-style (as opposed to the “linked-list-of-closures” style), which was a big reason we went with that in 5. We hypothesized the linked-list-of-subscribers style is faster because vtable lookup for the Subscriber methods is very fast, and v8 can represent them as constant_function descriptors since they’re on the prototype. For comparison:

function map<T, R>(source: Observable<T>, sel: (x: T) => R) {
  return new Observable<R>((sink: Observer<R>) => {
    return source.subscribe({
      next(x) { sink.next(sel(x)); }, // <-- a new function on each subscription
      error: sink.error.bind(sink),
      complete: sink.complete.bind(sink)
    });
  });
}
// vs.
class MapSubscriber<T, R> extends Subscriber<T> {
  constructor(sink: Observer<R>, sel: (x: T) => R) {
    super(sink);
    this.sel = sel;
  }
  next(x) { this.sink.next(this.sel(x)); } // <-- represented as a v8 `constant_function`
}
class Observable {
  map(sel: (x: T) => R) {
    return new Observable<R>((sink: Observer<R>) => {
      return this.subscribe(new MapSubscriber(sink, sel));
    })
  }
}

The same goes for operator classes vs. operator closures, with the added bonus that v8 optimizes lookups of properties assigned in the constructor. At the time these lookups were faster than accessing values from closure scope. I’m not sure if this is still true in turbofan, or whether turbofan does something similar as other VM’s (like .NET), optimizing closures to anonymous inner classes?

If we strip away all the ceremony around JIT optimizations, lift is just a function that maps a source Observable to a result Observable using an operator function that maps a sink Observer to a source Observer:

function lift<T, R>(source: Observable<T>, operator: (sink: Observer<R>) => Observer<T>): Observable<R> {
  return new Observable<R>((sink: Observer<R>) => {
    return source.subscribe(operator(sink));
  });
}

Another way to think about it is you can create an Observable chain going “down”:

srcObservable -> mapObservable -> filterObservable -> subscribe: (sink) => Subscription

…by composing operator functions that create Subscriber chains going back “up”:

src <- mapSubscriber <- filterSubscriber <- sink (aka the finalSubscriber)

In a sense the Observable chain is really just a factory for creating Subscriber chains, which is where the real logic lives.

In practice, we found it was necessary for the operator function to control the source.subscribe() call in a few cases (skip/take, multicast, etc.), so the real definition of lift is more like this:

function lift<T, R>(source: Observable<T>, operator(source: Observable<T>, sink: Observer<R>) => Subscription<T>): Observable<R> {
  return new Observable<R>((sink: Observer<R>) => {
    return operator(source, sink);
  });
}

But since we know the VM can represent functions on the prototype as constant_functions (and can embed the function pointer into optimizable functions), we can transform the lift function into its prototype form:

class Observable<T> {
  lift<R>(operator: Operator<T, R>): Observable<R> {
    return new Observable<R>((sink: Observer<R>) => {
      return operator(this, sink);
    });
  }
}

We can also get rid of that non-constant anonymous function that binds the operator function to the source and sink. All we have to do is broaden the scope of our design to include the Observable prototype’s subscribe method. We can assign the source and operator to the Observable being returned, then invoke the operator function inside the subscribe method:

class Observable<T> {
  lift<R>(operator: Operator<T, R>): Observable<R> {
    const observable = new Observable<R>();
    // lookups for these properties later would be faster if
    // these assignments happened in the Observable constructor
    observable.source = this;
    observable.operator = operator;
    return observable;
  }
  subscribe(observerOrNext?: PartialObserver<T> | ((value: T) => void), error?: (error: any) => void, complete?: () => void): Subscription {
    const { operator } = this;
    const sink = toSubscriber(observerOrNext, error, complete);
    if (operator) {
      operator.call(sink, this.source);
    } else {
      sink.add(this.source ? this._subscribe(sink) : this._trySubscribe(sink));
    }
    return sink;
  }

Even if turbofan is faster than crankshaft at invoking anonymous functions, the closure approach has 1 more allocation (aka 2x as many total allocations) than assigning the source and operator to the Observable instance being returned.

lift vs. pipe

Now let’s compare the “pure” lift and pipe signatures:

// I'm intentionally ignoring pipe's multiple operator function args,
// since we could redefine lift to also take multiple operator functions
type Operator = <T, R>(sink: Observer<R>) => Observer<T>
type lift = <T, R>(src: Observable<T>, op: Operator) => Observable<R>;
type pipe = <T, R>(op: Operator) => (src: Observable<T>) => Observable<R>

Looking closely, pipe is just the partially-applied form of lift! Pipe just re-arranges the position of the source Observable<T> and operator function, allowing the source Observable to be supplied at the end instead of the beginning. They’re so close that we wouldn’t even need two methods if we were in a functional language that partially applies functions by default.

From this perspective, the whole “lift + operators implemented in terms of Observers, + Operator classes” scheme is a thing we found that let us efficiently implement the pipe-style for operators internally before the VMs started helping us out.

I should note, I did take some liberties with the operator definitions above:

pipe’s operator function maps an Observable<T> to an Observable<R>
lift’s operator function maps an Observer<R> to an Observer<T>

This is just another way to represent the idea of either:

building an Observable chain down from the source to the sink
or building an Observer chain up from the sink to the source

I guess I’m unclear on what removing lift entirely would mean? Taking it off the prototype? Inlining the contents of the lift method into every operator? Reimplementing operators in terms of Observables (and anonymous Observers + Disposables)?

While I believe it’s always valuable to re-evaluate our assumptions on performance, my intuition is that implementing operators in terms of Observers (or rather, Subscribers) is still faster than implementing them in terms of Observables. All the operators are currently still implemented in terms of Observers and pipe is just a wrapper around lift, so today we see no difference. With the obvious caveat that JITs are spooky, I’m dubious about any plans to reimplement the operators without some form of lift.

@benlesh Either way, the promise of lift as a centralized point to hook everything never came through, and in fact there were always some operators that just didn’t use lift so it wouldn’t have been perfect.

Unless something’s changed, I don’t think this has been true since #1941 landed?

@cartant to what extent was using lift to return instances with additional methods or properties encouraged? Is it likely to be a widespread practice?

I use it in a number of public projects, and a good number of proprietary projects as well. I know a number of other folks who rely on it to extend Observable and build custom asynchronous DSLs, but they’re understandably not very active in our GH issues.

1reaction

benleshcommented, Nov 15, 2017

Thanks for commenting on this @trxcllnt, I remember almost all of this from our original work, given our concerns with size, I’m trying to look at every avenue I can to reduce the footprint of the library. That’s the major catalyst here.

Another concern I have is that at some point, a standardized Observable will land, and it will not include a lift method, so any operators we create will not work with those Observables, because all of our operators rely on lift existing on the observable instance. I suppose we could make lift a standalone function, it could cost perf, but only when you first set up the observable, not per-next or even on per-subscription.

As of even a week ago, playing around with this, I’m pretty sure I want to keep lift regardless.

What do you think about moving off as a standalone function?

pipe is a different, and unfortunate animal. It’s mostly on the prototype for ergonomic reasons. I couldn’t think of anything prettier (that wasn’t dot-chaining). I’d love it if we could have the pipeline operator, but the TC39 seems rather anti-functional programming syntax from what I’ve been told.