SIGILL from libopenblas.so on Skylake CPU inside VMWare
Issue Description
My tests pass locally, both with and without Docker. On a build server, however, they fail with the following error:
44562 [DEBUG] Forking command line: /bin/sh -c cd /builds/s/mmm/mmm.signal && /usr/lib/jvm/java-11-openjdk-amd64/bin/java -jar /builds/s/mmm/mmm.signal/target/surefire/surefirebooter3171066246014079146.jar /builds/s/mmm/mmm.signal/target/surefire 2019-03-18T15-27-37_667-jvmRun1 surefire7278448647545906416tmp surefire_213468214502336623584tmp
45279 [INFO] Running com.e.s.mmm.signal.SssTest
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
46414 [INFO] Tests run: 23, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.13 s - in com.e.s.mmm.sss.SssTest
46414 [INFO] Running com.e.s.mmm.signal.SssFilterTest
46732 [WARNING] Corrupted STDOUT by directly writing to native stream in forked JVM 1. See FAQ web page and the dump file /builds/s/mmm/mmm.signal/target/surefire-reports/2019-03-18T15-27-37_667-jvmRun1.dumpstream
46733 [DEBUG] #
46733 [DEBUG] # A fatal error has been detected by the Java Runtime Environment:
46734 [DEBUG] #
46734 [DEBUG] # SIGILL (0x4) at pc=0x00007fd2372dc68c, pid=864, tid=865
46735 [DEBUG] #
46735 [DEBUG] # JRE version: OpenJDK Runtime Environment (11.0.2+9) (build 11.0.2+9-Debian-3bpo91)
46736 [DEBUG] # Java VM: OpenJDK 64-Bit Server VM (11.0.2+9-Debian-3bpo91, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
46736 [DEBUG] # Problematic frame:
46737 [DEBUG] # C [libopenblas.so.0+0x107f68c] ssymv_L_SKYLAKEX+0x4c
46737 [DEBUG] #
46738 [DEBUG] # Core dump will be written. Default location: Core dumps may be processed with "/usr/libexec/abrt-hook-ccpp %s %c %p %u %g %t e %P %I %h" (or dumping to /builds/s/mmm/mmm.signal/core.864)
46738 [DEBUG] #
46739 [DEBUG] # An error report file with more information is saved as:
46739 [DEBUG] # /builds/s/mmm/mmm.signal/hs_err_pid864.log
46740 [DEBUG] Compiled method (c1) 2153 2154 3 sun.nio.cs.UTF_8$Encoder::encodeLoop (28 bytes)
46740 [DEBUG] total in heap [0x00007fd2d532e390,0x00007fd2d532eb10] = 1920
46740 [DEBUG] relocation [0x00007fd2d532e508,0x00007fd2d532e578] = 112
46741 [DEBUG] main code [0x00007fd2d532e580,0x00007fd2d532e940] = 960
46741 [DEBUG] stub code [0x00007fd2d532e940,0x00007fd2d532e9f8] = 184
46741 [DEBUG] metadata [0x00007fd2d532e9f8,0x00007fd2d532ea10] = 24
46742 [DEBUG] scopes data [0x00007fd2d532ea10,0x00007fd2d532ea60] = 80
46742 [DEBUG] scopes pcs [0x00007fd2d532ea60,0x00007fd2d532eaf0] = 144
46742 [DEBUG] dependencies [0x00007fd2d532eaf0,0x00007fd2d532eaf8] = 8
46743 [DEBUG] nul chk table [0x00007fd2d532eaf8,0x00007fd2d532eb10] = 24
46743 [DEBUG] #
46743 [DEBUG] # If you would like to submit a bug report, please visit:
46744 [DEBUG] # http://bugreport.java.com/bugreport/crash.jsp
46744 [DEBUG] # The crash happened outside the Java Virtual Machine in native code.
46745 [DEBUG] # See problematic frame for where to report the bug.
46745 [DEBUG] #
The setup is as follows: a pom.xml that includes deeplearning4j-core and nd4j-native-platform.
<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-core</artifactId>
</dependency>
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-native-platform</artifactId>
</dependency>
Failing test source:
@Test
@DisplayName("Nd4j is available")
void testSymmetricGeneralizedEigenvaluesAvailableInBLAS() {
    // Simple test to verify that nd4j is actually available,
    // and is able to use a BLAS wrapper to compute eigenvalues of
    // a symmetric matrix.
    INDArray M = Nd4j.create(new float[]{
        -2, -2, 4,
        -2,  1, 2,
         4,  2, 5
    }, new int[]{3, 3});
    Eigen.symmetricGeneralizedEigenvalues(M);
    // ignored return value contains eigenvalues
    // M now contains eigenvectors as columns
}
Version Information
- Deeplearning4j version: 1.0.0-beta3
- Platform: VMWare hypervisor (ESXi)
- Guest OS: Red Hat Enterprise Linux Server release 7.6 (Maipo)
- Docker image running tests: openjdk:11-jdk
$ docker run -it --rm openjdk:11-jdk bash
root@86b5c7cd0147:/# java --version
openjdk 11.0.2 2019-01-15
OpenJDK Runtime Environment (build 11.0.2+9-Debian-3bpo91)
OpenJDK 64-Bit Server VM (build 11.0.2+9-Debian-3bpo91, mixed mode, sharing)
CPU info:
processor : 5
vendor_id : GenuineIntel
cpu family : 6
model : 85
model name : Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz
stepping : 4
microcode : 0x2000043
cpu MHz : 2294.609
cache size : 16896 KB
physical id : 10
siblings : 1
core id : 0
cpu cores : 1
apicid : 10
initial apicid : 10
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 invpcid rtm rdseed adx smap xsaveopt arat
bogomips : 4589.21
clflush size : 64
cache_alignment : 64
address sizes : 42 bits physical, 48 bits virtual
power management:
It works on my local machine, which has the following CPU (also running Red Hat 7.6):
processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 158
model name : Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
stepping : 9
microcode : 0x8e
cpu MHz : 899.877
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 7
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 22
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp spec_ctrl intel_stibp flush_l1d
bogomips : 5808.00
clflush size : 64
cache_alignment : 64
address sizes : 39 bits physical, 48 bits virtual
power management:
PS: I did not deliberately choose OpenBLAS; it appears to be the default in the environment described above (Red Hat + Docker).
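One detail in the two flags lines above may be relevant: the crashing frame is ssymv_L_SKYLAKEX, and OpenBLAS builds its SKYLAKEX kernels for AVX-512, yet the build server's flags line contains no avx512 entries. An illegal instruction would be consistent with that mismatch, although the issue itself does not confirm the cause. The following is a minimal sketch for checking a machine's flags, assuming a Linux-style /proc/cpuinfo; the class name CpuFlags and the method hasFlag are hypothetical and not part of ND4J or JavaCPP.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of ND4J/JavaCPP): looks up a CPU feature flag
// in the lines of a Linux /proc/cpuinfo file.
public final class CpuFlags {

    /** Returns true if any "flags" line in the given cpuinfo lines lists the flag. */
    public static boolean hasFlag(List<String> cpuinfoLines, String flag) {
        for (String line : cpuinfoLines) {
            if (!line.startsWith("flags")) {
                continue; // only the "flags : ..." lines carry feature names
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String[] flags = line.substring(colon + 1).trim().split("\\s+");
            if (Arrays.asList(flags).contains(flag)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: running on Linux, where /proc/cpuinfo exists.
        List<String> lines = Files.readAllLines(Paths.get("/proc/cpuinfo"));
        System.out.println("avx2:    " + hasFlag(lines, "avx2"));
        System.out.println("avx512f: " + hasFlag(lines, "avx512f"));
    }
}
```

Running this inside the same container that executes the tests shows the flags the guest actually exposes, which can differ from what the physical CPU model supports.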
Issue Analytics
- State:
- Created 5 years ago
- Comments:6 (3 by maintainers)
MKL is probably not used by default because MKLML doesn't contain most of LAPACK, so to be safe it isn't used for LAPACK functions by default. You can try to force it by setting the "org.bytedeco.javacpp.openblas.load" system property to "mklml". Please let me know whether that works for your application.
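The suggestion above could be applied programmatically, as in the minimal sketch below. It assumes the property is read when the native library is first loaded, so it has to be set before any ND4J class is touched. The class name BlasSelector and the method selectBlas are hypothetical; only the property name and the value "mklml" come from the comment above.

```java
// Hypothetical helper: sets the JavaCPP system property named in the comment
// above so that "mklml" is requested instead of the default OpenBLAS binary.
public final class BlasSelector {

    // Property name quoted from the maintainer's comment.
    static final String LOAD_PROPERTY = "org.bytedeco.javacpp.openblas.load";

    /** Sets the property and returns the value now in effect. */
    public static String selectBlas(String library) {
        System.setProperty(LOAD_PROPERTY, library);
        return System.getProperty(LOAD_PROPERTY);
    }

    public static void main(String[] args) {
        // Assumption: this runs before the first ND4J call, because the
        // property would only be consulted when the native library is loaded.
        String chosen = selectBlas("mklml");
        System.out.println(LOAD_PROPERTY + "=" + chosen);
        // ND4J code, such as the failing test's Eigen call, would follow here.
    }
}
```

In a test run, the property has to reach the forked Surefire JVM rather than the Maven process itself, for example as -Dorg.bytedeco.javacpp.openblas.load=mklml in Surefire's argLine setting, or from a static initializer in the test class that runs before any ND4J class is used.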
Thanks! That works. 👍
I’ll make sure to follow the bytedeco/javacpp changelog from now on.