Performance Optimizations (including Rust/AssemblyScript -> WASM)

This document is a work in progress while I figure out a game plan for where performance efforts are best focused.

I have experimented with benchmarking the squash() and sigmoid()/sigmoidDerivative() functions re-written in both Rust and AssemblyScript compiled to WASM.

Rust -> WASM showed a very large performance increase (roughly 4.5x) versus JS on Chrome with squash() given 1,000,000 inputs. On Firefox, the performance was roughly the same. I still need to benchmark Rust WASM on Node.

AssemblyScript -> WASM was roughly 2.5x as performant with squash() given 1,000,000 inputs on Node. I still need to benchmark AssemblyScript WASM in browsers.

I will create a repo for the test functions and post the source code here, as well as upload screenshots from console output that have benchmark measurements soon.

@postspectacular gave some feedback and insane performance tuning on the basic implementation I had in AssemblyScript as well:

https://gist.github.com/postspectacular/3dccbfed1b753edadf1b6fee8add4808#file-03-sigmoid-simd-ts-L1

He dropped down to low-level memory/pointers, and even to SIMD instructions (though from his comment, it seems there are limitations in WASM on the V8 engine, which doesn't fully support SIMD yet). So there is obviously much room to improve on these benchmarks. I am not familiar with Rust, AssemblyScript, or even low-level memory concepts on the whole. Perhaps some other users might be able to give feedback or advice here?
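To make the "memory/pointers" idea concrete, here is a minimal sketch in plain TypeScript. It is not the code from the gist: the ArrayBuffer is only a stand-in for WASM linear memory, and the function name sigmoidInPlace and its (ptr, len) signature are assumptions made for illustration.

```typescript
// Hypothetical analog of a pointer-style WASM function: the caller passes a
// byte offset (ptr) and an element count (len) instead of an array object.
// The ArrayBuffer stands in for WASM linear memory.
function sigmoidInPlace(buffer: ArrayBuffer, ptr: number, len: number): void {
  // View `len` f64 values starting at byte offset `ptr` (must be a multiple of 8).
  const view = new Float64Array(buffer, ptr, len);
  for (let i = 0; i < len; i++) {
    // Overwrite each value with its logistic sigmoid; nothing is allocated or copied.
    view[i] = 1 / (1 + Math.exp(-view[i]));
  }
}
```

The point of this layout is that the data never leaves the buffer, so no arrays are copied between caller and callee.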

The best way to approach this is probably:

  • Profile the project while it runs and see which functions are used most. By the Pareto Principle (80-20 rule), a small handful of methods likely account for the vast majority of calls. These are where efforts should be focused.
  • Create a way to benchmark these functions with mock data in an isolated environment. Testing changes without this will be really difficult.
  • Experiment with rewriting the most-used functions in Rust/AssemblyScript. Benchmark and consider the results.
  • Look into using tooling from existing libraries to make things easier, especially when it comes to math and matrix operations.
  • Look into using GPU.js and WebGL to represent array data as shaders. Umbrella has tooling for this too, I believe.
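The isolated-benchmark step above could be sketched roughly as follows. This is only an illustration: makeInputs, bench, and BenchResult are hypothetical names, the sigmoid functions are the textbook logistic definitions rather than this project's own code, and Date.now() is used as a coarse, portable clock.

```typescript
// Textbook logistic function and its derivative (not the project's own code).
function sigmoid(x: number): number {
  return 1 / (1 + Math.exp(-x));
}

function sigmoidDerivative(x: number): number {
  const s = sigmoid(x);
  return s * (1 - s);
}

// Hypothetical helper: deterministic mock inputs in [-1, 1), so that every
// run and every implementation sees exactly the same data.
function makeInputs(n: number, seed: number = 1): Float64Array {
  const out = new Float64Array(n);
  let state = seed >>> 0;
  for (let i = 0; i < n; i++) {
    // 32-bit linear congruential generator.
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    out[i] = (state / 4294967296) * 2 - 1;
  }
  return out;
}

interface BenchResult {
  name: string;
  iterations: number;
  totalMs: number;
  checksum: number;
}

// Hypothetical helper: runs `fn` over all inputs `iterations` times.
function bench(
  name: string,
  fn: (x: number) => number,
  inputs: Float64Array,
  iterations: number
): BenchResult {
  let checksum = 0; // summing the results keeps the engine from discarding the work
  const start = Date.now();
  for (let it = 0; it < iterations; it++) {
    for (let i = 0; i < inputs.length; i++) {
      checksum += fn(inputs[i]);
    }
  }
  return { name, iterations, totalMs: Date.now() - start, checksum };
}
```

A run matching the sizes mentioned in this issue would be bench("sigmoid", sigmoid, makeInputs(1000000), 100). A WASM export with the same (x) => number shape could be passed in place of sigmoid to compare the two on identical data.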

Issue Analytics

  • State: open
  • Created 4 years ago
  • Comments: 12

Top GitHub Comments

postspectacular commented, Nov 3, 2019 (3 reactions)

@GavinRay97 - no worries, i just felt like I needed to contextualize these snippets some more… all good!

@MaxGraey - thanks for reminding me of wasmstudio 😃 - I’ve uploaded some of my code from that gist there too and the results there are still similar to what i’ve found in node earlier (and contradictory to yours). But the only difference between the two benchmarked fn’s is their version of exp():

https://webassembly.studio/?f=kg1rea6qvbn

(@GavinRay97 et al - once the project sandbox has initialized, first open the browser console, then press “build & run”, timing info will be shown in console only… tests take ~12 sec altogether)

The tests are currently set up to work on normalized input data, executing 500 iterations on 1 million values each. The JS glue code is in /src/main.js and the AssemblyScript is in /assembly/main.ts.

On my laptop (MBP2015, Chrome 78) I’m getting:

bench("sigmoidApproxPtr"); // 811.406005859375ms (100 iters) // 4046.489013671875ms (500 iters)

bench("sigmoidNatPtr"); // 1504.90380859375ms (100 iters) // 7569.575927734375ms (500 iters)

Getting consistent results after multiple runs… food for thought, I think! And please, this is NO upstaging attempt, whatsoever!
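For readers who have not opened the sandbox, here is a rough TypeScript sketch of how two sigmoid variants can differ only in their exp(). It is not the code from the gist or the sandbox. The approximation used here, (1 + x/1024)^1024 computed by repeated squaring, is an assumption chosen for illustration, as are the names expApprox, sigmoidApprox, and sigmoidNative.

```typescript
// Assumed approximation (not taken from the gist): e^x ~ (1 + x/1024)^1024.
// Raising to the 1024th power takes 10 squarings, since 1024 = 2^10.
function expApprox(x: number): number {
  let y = 1 + x / 1024;
  for (let i = 0; i < 10; i++) {
    y *= y;
  }
  return y;
}

// Sigmoid using the built-in exp().
function sigmoidNative(x: number): number {
  return 1 / (1 + Math.exp(-x));
}

// Same formula; the only difference is the exp() implementation.
function sigmoidApprox(x: number): number {
  return 1 / (1 + expApprox(-x));
}
```

On normalized inputs in [-1, 1] this particular approximation stays within about 1e-3 of the native result. Whether that trade-off is acceptable depends on the use case.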

MaxGraey commented, Nov 2, 2019 (2 reactions)

Also, please don’t use Math.pow(x, 2). Currently we can’t optimize this to x * x, but we will be able to in the future. You can safely replace x ** 2 with x * x, and this will increase speed even more.
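As a small illustration of that replacement, the sketch below writes the same sum of squares both ways. The function names are hypothetical and the code is plain TypeScript rather than AssemblyScript, so any speed difference has to be measured in the target environment.

```typescript
// Version to avoid: squares each value through a call to Math.pow.
function sumOfSquaresPow(values: Float64Array): number {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    total += Math.pow(values[i], 2);
  }
  return total;
}

// Suggested version: a plain multiplication needs no function call at all.
function sumOfSquaresMul(values: Float64Array): number {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    const x = values[i];
    total += x * x;
  }
  return total;
}
```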
