Provide SSE-optimized JNI functions


Previously, I tested doing matrix calculations in native functions via SSE/AVX instructions: https://github.com/JOML-CI/JOML/issues/30

That turned out to be slower than the same calculations done with standard scalar arithmetic in Java. The cause was the approach of “batching” all operations. For batching to work, the operands of each method invocation as well as the opcode of each operation had to be stored in native memory for the native function to read, decode and execute. Storing and reading those opcodes and operands was the major bottleneck.

Now there is another, more promising approach: not batching the operations, but directly calling a JNI function that does the job with optimized SSE instructions. Initial testing showed major performance increases. See the JMH results below:

joml-array (using a float[16]):

Benchmark                             Mode  Cnt          Score         Error  Units
Matrix4fBenchmarks.testInvert        thrpt    3   24760865,260 ±  910609,284  ops/s
Matrix4fBenchmarks.testMul           thrpt    3   34555251,163 ±  183270,652  ops/s
Matrix4fBenchmarks.testMulAffine     thrpt    3   52189265,415 ±  622020,725  ops/s

joml-jni (using native memory and JNI functions):

Benchmark                             Mode  Cnt          Score          Error  Units
Matrix4fBenchmarks.testInvert        thrpt    3   36367075,182 ±   283999,981  ops/s
Matrix4fBenchmarks.testMul           thrpt    3   70239126,361 ±    27891,033  ops/s
Matrix4fBenchmarks.testMulAffine     thrpt    3   76090662,949 ±   342632,179  ops/s
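
To put the two tables side by side, the speedup of each benchmark is just the ratio of the two throughput scores. The following is a small sketch of that arithmetic; the class and method names are made up for illustration, and the scores are copied from the tables above with the decimal commas written as points.

```java
import java.util.Locale;

// Illustrative helper, not part of JOML: compares two JMH throughput scores.
final class SpeedupSketch {

    /** Ratio of two throughput scores in ops/s; values above 1 mean the candidate is faster. */
    static double speedup(double candidateOpsPerSec, double baselineOpsPerSec) {
        return candidateOpsPerSec / baselineOpsPerSec;
    }

    public static void main(String[] args) {
        String[] names = {"testInvert", "testMul", "testMulAffine"};
        // joml-array scores (float[16], pure Java)
        double[] array = {24760865.260, 34555251.163, 52189265.415};
        // joml-jni scores (native memory and JNI functions)
        double[] jni = {36367075.182, 70239126.361, 76090662.949};

        for (int i = 0; i < names.length; i++) {
            System.out.printf(Locale.ROOT, "%-14s %.2fx%n", names[i], speedup(jni[i], array[i]));
        }
    }
}
```

With these numbers the JNI variant comes out at roughly 1.47x for testInvert, 2.03x for testMul and 1.46x for testMulAffine. Note that each run used only 3 measurement iterations, so the ratios are indicative rather than precise.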

Work on intrinsifying all heavy/important JOML methods has started in the jni branch based off the array branch.
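
For readers who want to see the kind of work that is being moved to native code, here is a minimal sketch of a scalar 4x4 multiply over a float[16]. It is not copied from the array branch: the class name, the method signature and the column-major layout (element at row r, column c stored at index c * 4 + r) are assumptions made for this example.

```java
// Illustrative scalar baseline, not JOML source code.
final class ScalarMul4x4 {

    /**
     * Computes dest = a * b for 4x4 matrices stored column-major in float[16].
     * A temporary array is used so that dest may be the same array as a or b.
     */
    static float[] mul(float[] a, float[] b, float[] dest) {
        float[] r = new float[16];
        for (int col = 0; col < 4; col++) {
            for (int row = 0; row < 4; row++) {
                float sum = 0f;
                for (int k = 0; k < 4; k++) {
                    // a(row, k) * b(k, col) in column-major indexing
                    sum += a[k * 4 + row] * b[col * 4 + k];
                }
                r[col * 4 + row] = sum;
            }
        }
        System.arraycopy(r, 0, dest, 0, 16);
        return dest;
    }
}
```

A native SSE version would compute the same 16 dot products, but on four floats at a time. That is why methods like this one are candidates for a JNI function, while cheaper methods are not.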

Issue Analytics

  • State: closed
  • Created: 7 years ago
  • Reactions: 2
  • Comments: 10 (5 by maintainers)

Top GitHub Comments

1 reaction
httpdigest commented, May 10, 2016

Sure thing! Btw., these are the only methods that actually benefit from native functions. For every other method, the JNI overhead of 19-22 clock cycles (on a 64-bit “server” HotSpot JVM, measured with the RDTSC instruction and an empty JNI function) is just too high, so the pure Java version ends up faster. If it weren’t for JNI’s overhead, hand-written SIMD native code would outrun every Java method. Waiting for Project Panama…

1 reaction
httpdigest commented, May 9, 2016

Here are some benchmark results on an i7 with JDK1.8.0_92:

Benchmark         Mode  Cnt   Score   Error  Units
CopyPojo          avgt    5   8,549 ± 0,525  ns/op
CopyUnsafe        avgt    5   7,222 ± 0,171  ns/op
IdentityPojo      avgt    5   6,111 ± 0,270  ns/op
IdentityUnsafe    avgt    5   5,427 ± 0,096  ns/op
InvertAVX         avgt    5  24,900 ± 5,570  ns/op
InvertAffinePojo  avgt    5  22,575 ± 0,304  ns/op
InvertPojo        avgt    5  38,973 ± 0,784  ns/op
InvertSSE         avgt    5  25,105 ± 0,645  ns/op
MulAVX            avgt    5  13,695 ± 0,727  ns/op
MulAffineAVX      avgt    5  11,378 ± 0,252  ns/op
MulAffinePojo     avgt    5  16,445 ± 0,138  ns/op
MulAffineSSE      avgt    5  13,002 ± 3,352  ns/op
MulPojo           avgt    5  25,888 ± 0,344  ns/op
MulSSE            avgt    5  15,251 ± 0,432  ns/op
ZeroPojo          avgt    5   6,186 ± 0,348  ns/op
ZeroUnsafe        avgt    5   5,060 ± 0,080  ns/op

Pojo means the normal Matrix4f Java version with its 16 primitive float fields; Unsafe means that sun.misc.Unsafe was used for faster copying; SSE means a JNI function using 128-bit x86 SSE; AVX means a JNI function using 128-bit x86 AVX1.
