Performance Optimizations (including Rust/AssemblyScript -> WASM)

This document will be a work in progress while I figure out a game plan for how and where efforts can best be focused to improve performance.

I have experimented with benchmarking the squash() and sigmoid()/sigmoidDerivative() functions re-written in both Rust and AssemblyScript compiled to WASM.
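For reference, a plain TypeScript baseline of the kind of functions being ported might look like the sketch below. It assumes sigmoid() is the standard logistic function and that sigmoidDerivative() takes the raw input x; the library's actual implementations are not shown in this issue, and mapInto and the sample values are hypothetical helpers added for illustration.

```typescript
// Logistic sigmoid: maps any real x into (0, 1).
function sigmoid(x: number): number {
  return 1 / (1 + Math.exp(-x));
}

// Derivative of the sigmoid, expressed through the sigmoid itself:
// s'(x) = s(x) * (1 - s(x)).
function sigmoidDerivative(x: number): number {
  const s = sigmoid(x);
  return s * (1 - s);
}

// Hypothetical helper: apply a scalar function to every element of a
// typed array, writing the results into a preallocated output array.
function mapInto(
  fn: (x: number) => number,
  source: Float64Array,
  out: Float64Array
): Float64Array {
  for (let i = 0; i < source.length; i++) {
    out[i] = fn(source[i]);
  }
  return out;
}

const sampleInputs = Float64Array.from([-2, 0, 2]);
const activations = mapInto(sigmoid, sampleInputs, new Float64Array(3));
const slopes = mapInto(sigmoidDerivative, sampleInputs, new Float64Array(3));
console.log(activations, slopes);
```

Working on typed arrays with a preallocated output keeps the JS baseline close to what a WASM version would operate on (a flat block of numbers), which should make the comparison fairer.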

Rust -> WASM showed a very large (~4.5x) performance increase versus JS on Chrome with squash() given 1,000,000 inputs. On Firefox, the performance was roughly the same. I still need to benchmark Rust WASM on Node.

AssemblyScript -> WASM was ~2.5x as performant with squash() given 1,000,000 inputs on Node. I still need to benchmark AssemblyScript WASM on browsers.

I will create a repo for the test functions and post the source code here, and I will soon upload screenshots of the console output showing the benchmark measurements.

@postspectacular gave some feedback and insane performance tuning on the basic implementation I had in AssemblyScript as well:

https://gist.github.com/postspectacular/3dccbfed1b753edadf1b6fee8add4808#file-03-sigmoid-simd-ts-L1

He dropped down to low-level memory/pointers, and even to SIMD instructions (though from his comment, it seems that WASM on the V8 engine has limitations and doesn't support SIMD quite yet). So there is obviously much room to improve on these benchmarks. I am not familiar with Rust, AssemblyScript, or even low-level memory concepts on the whole. Perhaps some other users might be able to give feedback or advice here?

The best way to approach this is probably:

Profile the project while it is running, and see which functions are most used. Apply the Pareto Principle/80-20 Rule: it is likely that a small handful of methods account for the vast majority of the running time. These are where efforts should be focused.
Create a way to benchmark these functions with mock data in an isolated environment; testing without this will be really difficult.
Experiment with re-writing the most used functions in Rust/AssemblyScript. Benchmark, consider results.
Look into using tooling from existing libraries to make things easier. Especially when it comes to math and matrix stuff.
Read into using GPU.js and WebGL to represent array data as shaders. Umbrella has tooling for this too, I believe.
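The isolated benchmarking step in the list above could start from something like this minimal sketch. Every name in it (bench, makeMockData, sigmoidAll) is hypothetical, and it assumes a runtime where performance.now() is available as a global (browsers and recent Node versions).

```typescript
// Time `fn` over `iters` full passes of `data`; returns elapsed milliseconds.
function bench(
  label: string,
  fn: (data: Float64Array, out: Float64Array) => void,
  data: Float64Array,
  iters: number
): number {
  const out = new Float64Array(data.length);
  fn(data, out); // warm-up pass so the JIT has a chance to optimize first
  const start = performance.now();
  for (let i = 0; i < iters; i++) {
    fn(data, out);
  }
  const elapsed = performance.now() - start;
  console.log(`${label}: ${elapsed.toFixed(1)} ms (${iters} iters)`);
  return elapsed;
}

// Mock data: `n` pseudo-random values in (-1, 1) from a seeded
// Park-Miller generator, so every run sees identical inputs.
// `seed` must be an integer between 1 and 2147483646.
function makeMockData(n: number, seed: number = 42): Float64Array {
  const data = new Float64Array(n);
  let state = seed;
  for (let i = 0; i < n; i++) {
    state = (state * 48271) % 2147483647;
    data[i] = (state / 2147483647) * 2 - 1;
  }
  return data;
}

// Function under test: the JS baseline that a WASM port would be compared to.
function sigmoidAll(data: Float64Array, out: Float64Array): void {
  for (let i = 0; i < data.length; i++) {
    out[i] = 1 / (1 + Math.exp(-data[i]));
  }
}

const mockData = makeMockData(100000);
const elapsedMs = bench("sigmoidAll (JS)", sigmoidAll, mockData, 10);
```

A WASM export can be passed to the same bench() through a thin wrapper with the same signature, so both versions are timed on identical data. Seeding the generator keeps runs repeatable, which matters when comparing numbers across engines.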

Top GitHub Comments

postspectacular commented, Nov 3, 2019

@GavinRay97 - no worries, i just felt like I needed to contextualize these snippets some more… all good!

@MaxGraey - thanks for reminding me of wasmstudio 😃 - I’ve uploaded some of my code from that gist there too and the results there are still similar to what i’ve found in node earlier (and contradictory to yours). But the only difference between the two benchmarked fn’s is their version of exp():

https://webassembly.studio/?f=kg1rea6qvbn

(@GavinRay97 et al - once the project sandbox has initialized, first open the browser console, then press “build & run”, timing info will be shown in console only… tests take ~12 sec altogether)

The tests are currently set up to work on normalized input data, executing 500 iterations on 1 million values each. The JS glue code is in /src/main.js. The AssemblyScript is in /assembly/main.ts.

On my laptop (MBP2015, Chrome 78) I’m getting:

bench("sigmoidApproxPtr"); // 811.406005859375ms (100 iters) // 4046.489013671875ms (500 iters)

bench("sigmoidNatPtr"); // 1504.90380859375ms (100 iters) // 7569.575927734375ms (500 iters)

Getting consistent results after multiple runs… food for thought, I think! And please, this is NO upstaging attempt, whatsoever!
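To make the "only their version of exp() differs" point concrete, here is a minimal TypeScript sketch of the same idea. The approximation used, (1 + x/1024)^1024 computed by repeated squaring, is a common textbook trick chosen purely for illustration; it is an assumption and not the approximation from the gist, and the function names are hypothetical.

```typescript
// Approximate e^x as (1 + x/1024)^1024, computed with 10 squarings
// (2^10 = 1024). Reasonable for small |x| such as normalized inputs.
function expApprox(x: number): number {
  let y = 1 + x / 1024;
  for (let i = 0; i < 10; i++) {
    y *= y;
  }
  return y;
}

// Sigmoid built on the native Math.exp().
function sigmoidNative(x: number): number {
  return 1 / (1 + Math.exp(-x));
}

// Same formula, but with the approximate exp() swapped in.
function sigmoidApprox(x: number): number {
  return 1 / (1 + expApprox(-x));
}

// Measure the worst-case absolute error over the normalized range [-1, 1].
let maxAbsError = 0;
for (let k = 0; k <= 2000; k++) {
  const x = -1 + k / 1000;
  const err = Math.abs(sigmoidNative(x) - sigmoidApprox(x));
  if (err > maxAbsError) {
    maxAbsError = err;
  }
}
console.log(`max |native - approx| on [-1, 1]: ${maxAbsError}`);
```

Whether an approximation like this is actually faster than the built-in exp() on a given engine is exactly what the benchmark has to show, and the accuracy loss needs to be acceptable for the use case.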

MaxGraey commented, Nov 2, 2019

Also plz don’t use Math.pow(x, 2). Currently we can’t optimize this to x * x, but that will come in the future. You can absolutely safely replace x ** 2 with x * x, and this will increase speed even more.
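As an illustration of that replacement, here is a small TypeScript sketch with a hypothetical sum-of-squares helper written both ways. It only shows that the two forms are interchangeable for squaring; it makes no claim about how much faster the second form is on any particular compiler or engine.

```typescript
// Squaring through the generic power function.
function sumSquaresPow(values: Float64Array): number {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    total += Math.pow(values[i], 2);
  }
  return total;
}

// Squaring with a plain multiply, as suggested above.
function sumSquaresMul(values: Float64Array): number {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    const v = values[i];
    total += v * v;
  }
  return total;
}

const squareInputs = Float64Array.from([1, 2, 3, 4]);
const viaPow = sumSquaresPow(squareInputs);
const viaMul = sumSquaresMul(squareInputs);
console.log(viaPow, viaMul); // both are 30 (1 + 4 + 9 + 16)
```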
