Reliable function inlining

scala-js: 0.6.27

I want to abstract over a piece of performance-critical code. With function inlining I could do everything I need, but some functions are not inlined, and I don’t see a pattern for when a function is inlined and when it is not.

Simple example:

@inline def foo1(
    bar: (() => Boolean) => Boolean,
): Boolean = {
  println("sideeffect")
  bar(() => false) && bar(() => false)
}

println("foo1" + foo1( bar = _()))

will be properly inlined by fastOptJS:

var this$27 = $m_s_Console$();
var this$28 = $as_Ljava_io_PrintStream(this$27.outVar$2.v$1);
this$28.java$lang$JSConsoleBasedPrintStream$$printString__T__V("sideeffect\n");
var x = ("foo1" + false);
var this$30 = $m_s_Console$();
var this$31 = $as_Ljava_io_PrintStream(this$30.outVar$2.v$1);
this$31.java$lang$JSConsoleBasedPrintStream$$printString__T__V((x + "\n"));

But this example:

@inline def foo2(
    bar: (() => Boolean) => Boolean,
): Boolean = {
  println("sideeffect")
  if (bar(() => false)) 5 else 7
  bar(() => false)
}
println("foo2" + foo2( bar = _()))

is not:

var f = (function(this$2$1) {
return (function(x$2$2) {
  var x$2 = $as_F0(x$2$2);
  return $uZ(x$2.apply__O())
})
})(this);
var this$33 = $m_s_Console$();
var this$34 = $as_Ljava_io_PrintStream(this$33.outVar$2.v$1);
this$34.java$lang$JSConsoleBasedPrintStream$$printString__T__V("sideeffect\n");
var arg1$3 = new $c_sjsr_AnonFunction0().init___sjs_js_Function0((function($this) {
return (function() {
  return false
})
})(this));
$uZ(f(arg1$3));
var arg1$4 = new $c_sjsr_AnonFunction0().init___sjs_js_Function0((function(this$2$2) {
return (function() {
  return false
})
})(this));
var x$1 = ("foo2" + $uZ(f(arg1$4)));
var this$36 = $m_s_Console$();
var this$37 = $as_Ljava_io_PrintStream(this$36.outVar$2.v$1);
this$37.java$lang$JSConsoleBasedPrintStream$$printString__T__V((x$1 + "\n"));

Why is inlining not working in the second example? It’s obviously not about the function signature, but somehow about how the arguments are used in the function body.
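For reference, here is a minimal sketch of what a fully inlined foo2(bar = _()) would reduce to. The object name Foo2Expected and the method foo2HandInlined are hypothetical and only serve as an illustration; they are not optimizer output.

```scala
object Foo2Expected {
  // foo2 as posted above.
  @inline def foo2(bar: (() => Boolean) => Boolean): Boolean = {
    println("sideeffect")
    if (bar(() => false)) 5 else 7 // result is discarded
    bar(() => false)
  }

  // Hand-inlined equivalent of foo2(bar = _()) (hypothetical helper):
  // bar applied to `() => false` is just `false`, and the discarded
  // if-expression has no side effect, so only the println remains.
  def foo2HandInlined(): Boolean = {
    println("sideeffect")
    false
  }
}
```

Both versions print the same line and return the same value, so the closure allocations in the generated code above are pure overhead for this call.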

Here is a more complex example, the one I’m actually working on. (I replaced custom array data structures with mutable.Stack and mutable.HashSet to reduce code size and make it a self-contained example):

@inline def depthFirstSearchGeneric[PROCESSRESULT](
    vertexCount: Int,
    foreachSuccessor: (Int, Int => Unit) => Unit, // (idx, f) => successors(idx).foreach(f)
    init: (Int => Unit, collection.mutable.Stack[Int]) => Unit, // (enqueue,_) => enqueue(start)
    processVertex: Int => PROCESSRESULT, // result += _
    loopConditionGuard: (() => Boolean) => Boolean = condition => condition(),
    advanceGuard: (PROCESSRESULT, () => Unit) => Unit =
      (result: PROCESSRESULT, advance: () => Unit) => advance(),
    enqueueGuard: (Int, () => Unit) => Unit = (elem, enqueue) => enqueue()
): Unit = {
  val stack = new collection.mutable.Stack[Int] // ArrayStackInt.create(capacity = vertexCount)
  val visited = new collection.mutable.HashSet[Int] // ArraySet.create(vertexCount)

  @inline def enqueue(elem: Int): Unit = {
    enqueueGuard(elem, { () =>
      stack.push(elem)
      visited += elem
    })
  }

  init(enqueue, stack)
  while (loopConditionGuard(() => !stack.isEmpty)) {
    val current = stack.pop()
    visited += current

    advanceGuard(
      processVertex(current),
      () =>
        foreachSuccessor(current, { next =>
          if (!visited.contains(next)) {
            enqueue(next)
          }
        })
    )
  }
}

val edges = Array(Array[Int](1), Array[Int](0))
depthFirstSearchGeneric(
  edges.size,
  (idx, f) => edges(idx).foreach(f),
  init = (enqueue, _) => enqueue(0),
  processVertex = v => println(v)
)

Which produces:

// val edges = Array(Array[Int](1), Array[Int](0))
var array = [1];
var xs = new $c_sjs_js_WrappedArray().init___sjs_js_Array(array);
var len = $uI(xs.array$6.length);
var array$1 = $newArrayObject($d_I.getArrayOf(), [len]);
var elem$1 = 0;
elem$1 = 0;
var this$13 = new $c_sc_IndexedSeqLike$Elements().init___sc_IndexedSeqLike__I__I(xs, 0, $uI(xs.array$6.length));
while (this$13.hasNext__Z()) {
var arg1 = this$13.next__O();
array$1.set(elem$1, $uI(arg1));
elem$1 = ((1 + elem$1) | 0)
};
var array$2 = [0];
var xs$1 = new $c_sjs_js_WrappedArray().init___sjs_js_Array(array$2);
var len$1 = $uI(xs$1.array$6.length);
var array$3 = $newArrayObject($d_I.getArrayOf(), [len$1]);
var elem$1$1 = 0;
elem$1$1 = 0;
var this$20 = new $c_sc_IndexedSeqLike$Elements().init___sc_IndexedSeqLike__I__I(xs$1, 0, $uI(xs$1.array$6.length));
while (this$20.hasNext__Z()) {
var arg1$1 = this$20.next__O();
array$3.set(elem$1$1, $uI(arg1$1));
elem$1$1 = ((1 + elem$1$1) | 0)
};
var array$4 = [array$1, array$3];
var xs$2 = new $c_sjs_js_WrappedArray().init___sjs_js_Array(array$4);
var len$2 = $uI(xs$2.array$6.length);
var array$5 = $newArrayObject($d_I.getArrayOf().getArrayOf(), [len$2]);
var elem$1$2 = 0;
elem$1$2 = 0;
var this$28 = new $c_sc_IndexedSeqLike$Elements().init___sc_IndexedSeqLike__I__I(xs$2, 0, $uI(xs$2.array$6.length));
while (this$28.hasNext__Z()) {
var arg1$2 = this$28.next__O();
array$5.set(elem$1$2, arg1$2);
elem$1$2 = ((1 + elem$1$2) | 0)
};
// the four function arguments, that are not inlined.
// (vertexCount, init and processVertex are inlined)
var foreachSuccessor = new $c_sjsr_AnonFunction2().init___sjs_js_Function2((function($this, edges) {
return (function(idx$2, f$2) {
  var idx = $uI(idx$2);
  var f = $as_F1(f$2);
  var xs$3 = edges.get(idx);
  var i = 0;
  var len$3 = xs$3.u.length;
  while ((i < len$3)) {
    var idx$1 = i;
    f.apply__O__O(xs$3.get(idx$1));
    i = ((1 + i) | 0)
  }
})
})(this, array$5));
var loopConditionGuard = this.depthFirstSearchGeneric$default$5$1__p1__F1();
var advanceGuard = this.depthFirstSearchGeneric$default$6$1__p1__F2();
var enqueueGuard = this.depthFirstSearchGeneric$default$7$1__p1__F2();
// function body
var stack = new $c_scm_Stack().init___();
var visited = new $c_scm_HashSet().init___();
enqueueGuard.apply__O__O__O(0, new $c_sjsr_AnonFunction0().init___sjs_js_Function0((function($this$1, stack$1, elem, visited$1) {
return (function() {
  stack$1.push__O__scm_Stack(elem);
  visited$1.$$plus$eq__O__scm_HashSet(elem)
})
})(this, stack, 0, visited)));
while ($uZ(loopConditionGuard.apply__O__O(new $c_sjsr_AnonFunction0().init___sjs_js_Function0((function(this$2$1, stack$2) {
return (function() {
  return (!stack$2.elems$5.isEmpty__Z())
})
})(this, stack))))) {
var current = $uI(stack.pop__O());
visited.$$plus$eq__O__scm_HashSet(current);
var this$35 = $m_s_Console$();
var this$36 = $as_Ljava_io_PrintStream(this$35.outVar$2.v$1);
this$36.java$lang$JSConsoleBasedPrintStream$$printString__T__V((current + "\n"));
advanceGuard.apply__O__O__O((void 0), new $c_sjsr_AnonFunction0().init___sjs_js_Function0((function(this$3$1, foreachSuccessor$1, current$1, visited$2, enqueueGuard$1, stack$3) {
  return (function() {
    foreachSuccessor$1.apply__O__O__O(current$1, new $c_sjsr_AnonFunction1().init___sjs_js_Function1((function($this$2, visited$1$1, enqueueGuard$1$1, stack$1$1) {
      return (function(next$2) {
        var next = $uI(next$2);
        if ((!$f_scm_FlatHashTable__containsElem__O__Z(visited$1$1, next))) {
          enqueueGuard$1$1.apply__O__O__O(next, new $c_sjsr_AnonFunction0().init___sjs_js_Function0((function($this$3, stack$1$2, elem$2, visited$1$2) {
            return (function() {
              stack$1$2.push__O__scm_Stack(elem$2);
              visited$1$2.$$plus$eq__O__scm_HashSet(elem$2)
            })
          })($this$2, stack$1$1, next, visited$1$1)))
        }
      })
    })(this$3$1, visited$2, enqueueGuard$1, stack$3)))
  })
})(this, foreachSuccessor, current, visited, enqueueGuard, stack)))
};

At first I expected everything to be inlined. Then I thought the Closure Compiler or the JIT might optimize these cases, but I checked the generated code and benchmarked: they don’t optimize the way I need. There is a measurable performance hit when the functions are not inlined.
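A rough sketch of how such a comparison could be set up, assuming a simplified loop rather than the real search. All names here (GuardOverheadSketch, countDirect, countGuarded, timeMs) are hypothetical, and the timing helper is a crude wall-clock measurement, not a proper benchmark harness.

```scala
object GuardOverheadSketch {
  // Hand-inlined loop: the condition is evaluated directly.
  def countDirect(n: Int): Int = {
    var i = 0
    var acc = 0
    while (i < n) {
      acc += (i & 1)
      i += 1
    }
    acc
  }

  // Same loop, but the condition goes through a guard closure on every
  // iteration, as happens when loopConditionGuard is not inlined.
  def countGuarded(n: Int, guard: (() => Boolean) => Boolean): Int = {
    var i = 0
    var acc = 0
    while (guard(() => i < n)) {
      acc += (i & 1)
      i += 1
    }
    acc
  }

  // Crude timing helper: returns the result and the elapsed milliseconds.
  def timeMs(body: => Int): (Int, Double) = {
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1e6)
  }
}
```

Calling timeMs(countDirect(n)) and timeMs(countGuarded(n, c => c())) several times after a warm-up gives a first impression; the numbers depend heavily on the JavaScript engine and on whether fastOptJS or fullOptJS output is used.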

So my alternative solutions are:

  1. Hard-code all needed variations by hand. This is where I’m coming from: I had about 15 variations of this function and made lots of mistakes. It was unmaintainable.
  2. Write a macro, which I would like to avoid.

Are there any other options I didn’t think of? Is it possible to tweak my function, so that all arguments are inlined?

Sorry for this very long issue and thanks for your help! 😅

Issue Analytics

  • State: closed
  • Created: 4 years ago
  • Reactions: 1
  • Comments: 5 (4 by maintainers)

Top GitHub Comments

sjrd commented, Apr 25, 2019 (2 reactions)

I managed to get everything inlined the way you want with the following variant. But it’s awkward because I had to duplicate the enqueueGuard param, and two lambdas doing the same thing would need to be passed as actual arguments if you want to use something other than the default at the call site:

  @inline private def loopConditionGuardDefault: (() => Boolean) => Boolean =
    condition => condition()

  @inline private def advanceGuardDefault[PROCESSRESULT]: (PROCESSRESULT, () => Unit) => Unit =
    (result: PROCESSRESULT, advance: () => Unit) => advance()

  @inline private def enqueueGuardDefault: (Int, () => Unit) => Unit =
    (elem, enqueue) => enqueue()

  @inline def depthFirstSearchGeneric[PROCESSRESULT](
      vertexCount: Int,
      foreachSuccessor: (Int, Int => Unit) => Unit, // (idx, f) => successors(idx).foreach(f)
      init: (Int => Unit, collection.mutable.Stack[Int]) => Unit, // (enqueue,_) => enqueue(start)
      processVertex: Int => PROCESSRESULT, // result += _
      loopConditionGuard: (() => Boolean) => Boolean = loopConditionGuardDefault,
      advanceGuard: (PROCESSRESULT, () => Unit) => Unit = advanceGuardDefault,
      enqueueGuard1: (Int, () => Unit) => Unit = enqueueGuardDefault,
      enqueueGuard2: (Int, () => Unit) => Unit = enqueueGuardDefault
  ): Unit = {
    val stack = new collection.mutable.Stack[Int] // ArrayStackInt.create(capacity = vertexCount)
    val visited = new collection.mutable.HashSet[Int] // ArraySet.create(vertexCount)

    @inline def enqueue1(elem: Int): Unit = {
      enqueueGuard1(elem, { () =>
        stack.push(elem)
        visited += elem
      })
    }

    @inline def enqueue2(elem: Int): Unit = {
      enqueueGuard2(elem, { () =>
        stack.push(elem)
        visited += elem
      })
    }

    init(enqueue1, stack)
    while (loopConditionGuard(() => !stack.isEmpty)) {
      val current = stack.pop()
      visited += current

      advanceGuard(
        processVertex(current),
        () =>
          foreachSuccessor(current, { next =>
            if (!visited.contains(next)) {
              enqueue2(next)
            }
          })
      )
    }
  }
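
A call-site sketch of the duplicated-guard pattern with a non-default guard. It uses a reduced copy of the variant above (named dfs here, keeping only the two enqueue guards) and a hypothetical belowLimit guard that is not part of the original code. Whether the optimizer inlines this particular call still has to be confirmed by inspecting the fastOptJS output.

```scala
import scala.collection.mutable

object DfsGuardUsage {
  // Hypothetical custom guard: only enqueue vertices below `limit`.
  // Written as an @inline def, like the defaults above, so that each
  // reference yields its own lambda.
  @inline private def belowLimit(limit: Int): (Int, () => Unit) => Unit =
    (elem, enqueue) => if (elem < limit) enqueue()

  // Reduced copy of the variant above: only the two enqueue guards are kept.
  @inline def dfs(
      foreachSuccessor: (Int, Int => Unit) => Unit,
      start: Int,
      processVertex: Int => Unit,
      enqueueGuard1: (Int, () => Unit) => Unit,
      enqueueGuard2: (Int, () => Unit) => Unit
  ): Unit = {
    val stack = new mutable.Stack[Int]
    val visited = new mutable.HashSet[Int]

    @inline def enqueue1(elem: Int): Unit =
      enqueueGuard1(elem, { () =>
        stack.push(elem)
        visited += elem
      })

    @inline def enqueue2(elem: Int): Unit =
      enqueueGuard2(elem, { () =>
        stack.push(elem)
        visited += elem
      })

    enqueue1(start)
    while (stack.nonEmpty) {
      val current = stack.pop()
      visited += current
      processVertex(current)
      foreachSuccessor(current, { next =>
        if (!visited.contains(next)) enqueue2(next)
      })
    }
  }

  // Call site: the same guard is passed once per guard parameter.
  def visitOrder(limit: Int): List[Int] = {
    val edges = Array(Array(1, 2), Array(0, 3), Array(3), Array(0))
    val order = mutable.ListBuffer.empty[Int]
    dfs(
      (idx, f) => edges(idx).foreach(f),
      start = 0,
      processVertex = v => order += v,
      enqueueGuard1 = belowLimit(limit),
      enqueueGuard2 = belowLimit(limit)
    )
    order.toList
  }
}
```

With limit = 3, vertex 3 is never enqueued, so visitOrder(3) yields List(0, 2, 1). The duplication at the call site is exactly the awkwardness described above.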

sjrd commented, May 15, 2019 (1 reaction)

I’m going to close this, as I don’t think there’s anything actionable for this repo.
