BUG: OpenBLAS 0.3.9 shipped with the scipy 1.7.1 wheel can cause segfaults when it fails to detect the correct CPU architecture (Prescott instead of Haswell)

Describe your issue.

Recent Intel CPUs that recent versions of OpenBLAS detect as SkylakeX are wrongly detected as Prescott by the OpenBLAS 0.3.9 shipped with scipy 1.7.1.

This in turn can cause segfaults when calling the ddot_k_PRESCOTT routine on a read-only memory buffer allocated by numpy.memmap with mode="r".

EDIT: the more probable cause is that the data is not memory aligned in this test case.

See https://github.com/scikit-learn/scikit-learn/issues/21361 for details and reproducing code using the scikit-learn test suite.
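
To make the two candidate causes above concrete, here is a minimal sketch (not the reproducer from the linked issue) that builds a read-only numpy.memmap whose data pointer is deliberately misaligned, then checks both properties. The helper names is_aligned and aligned_copy, the 8-byte file header and the 64-byte alignment threshold are assumptions chosen for illustration; the issue does not state which alignment the Prescott kernel expects.

```python
import os
import tempfile

import numpy as np


def is_aligned(arr, alignment=64):
    """Return True if the array's data pointer is a multiple of `alignment` bytes."""
    return arr.__array_interface__["data"][0] % alignment == 0


def aligned_copy(arr, alignment=64):
    """Return a writable in-memory copy of `arr` whose data starts on an aligned address."""
    # Over-allocate raw bytes, then slice so that the first byte is aligned.
    buf = np.empty(arr.nbytes + alignment, dtype=np.uint8)
    shift = (-buf.__array_interface__["data"][0]) % alignment
    out = buf[shift:shift + arr.nbytes].view(arr.dtype).reshape(arr.shape)
    out[...] = arr
    return out


# Hypothetical file layout: an 8-byte header followed by float64 values.
n = 100
path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(b"\0" * 8)
    f.write(np.arange(n, dtype=np.float64).tobytes())

# Map the values read-only, skipping the header. The mapping itself starts on a
# page boundary, so the array data starts 8 bytes past it.
x = np.memmap(path, dtype=np.float64, mode="r", offset=8, shape=(n,))
y = aligned_copy(x)

print("read-only:", not x.flags.writeable)
print("memmap 64-byte aligned:", is_aligned(x))
print("copy 64-byte aligned:", is_aligned(y))
```

Passing an aligned, writable copy such as y to the BLAS-backed routine is one way to check whether alignment rather than the read-only flag is what triggers the crash.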

Error message

(gdb) bt
#0  0x00007fffdbe31c31 in ddot_k_PRESCOTT () from /home/aperez/dev/sandbox/test/scikit-learn/venv/lib/python3.8/site-packages/scipy/spatial/../../scipy.libs/libopenblasp-r0-085ca80a.3.9.so
#1  0x00007fffc893b39a in ?? () from /home/aperez/dev/sandbox/test/scikit-learn/venv/lib/python3.8/site-packages/scipy/linalg/cython_blas.cpython-38-x86_64-linux-gnu.so
#2  0x00007fffc89226e0 in ?? () from /home/aperez/dev/sandbox/test/scikit-learn/venv/lib/python3.8/site-packages/scipy/linalg/cython_blas.cpython-38-x86_64-linux-gnu.so
#3  0x00007fffbe547aca in __pyx_fuse_1__pyx_f_7sklearn_5utils_12_cython_blas__dot (__pyx_v_n=<optimized out>, __pyx_v_x=<optimized out>, __pyx_v_incx=<optimized out>, __pyx_v_y=<optimized out>, 
    __pyx_v_incy=<optimized out>) at sklearn/utils/_cython_blas.c:2861
#4  0x00007fffbe258327 in svm::Kernel::Kernel (this=0x7fffffff6030, l=<optimized out>, x_=<optimized out>, param=..., blas_functions=0x7fffffff64e0) at sklearn/svm/src/libsvm/svm.cpp:394
#5  0x00007fffbe25b394 in svm::SVC_Q::SVC_Q (blas_functions=0x7fffffff64e0, y_=0x555556ff0fd0 '\001' <repeats 100 times>, '\377' <repeats 100 times>..., param=..., prob=..., this=0x7fffffff6030)
    at sklearn/svm/src/libsvm/svm.cpp:1682
#6  svm::solve_nu_svc (blas_functions=0x7fffffff64e0, si=0x7fffffff6000, alpha=0x555557450160, param=0x7fffffff69a0, prob=0x7fffffff6360) at sklearn/svm/src/libsvm/svm.cpp:1682
#7  svm::svm_train_one (prob=0x7fffffff6360, param=0x7fffffff69a0, Cp=<optimized out>, Cn=<optimized out>, status=0x7fffffff64d4, blas_functions=0x7fffffff64e0)
    at sklearn/svm/src/libsvm/svm.cpp:1856
#8  0x00007fffbe264cde in svm_train (prob=0x7fffffff6340, prob@entry=0x7fffffff6500, param=param@entry=0x7fffffff69a0, status=status@entry=0x7fffffff64d4, 
    blas_functions=blas_functions@entry=0x7fffffff64e0) at sklearn/svm/src/libsvm/svm.cpp:2504
#9  0x00007fffbe23e921 in __pyx_pf_7sklearn_3svm_7_libsvm_fit (__pyx_v_X=__pyx_v_X@entry=0x7fffbb041db0, __pyx_v_Y=__pyx_v_Y@entry=0x7fffbb041c90, __pyx_v_svm_type=__pyx_v_svm_type@entry=1, 
    __pyx_v_kernel=__pyx_v_kernel@entry=0x7fffc3052730, __pyx_v_degree=__pyx_v_degree@entry=3, __pyx_v_gamma=__pyx_v_gamma@entry=0.53188777537536391, __pyx_v_coef0=__pyx_v_coef0@entry=0, 
    __pyx_v_tol=__pyx_v_tol@entry=0.001, __pyx_v_C=__pyx_v_C@entry=0, __pyx_v_nu=0.5, __pyx_v_epsilon=0, __pyx_v_class_weight=__pyx_v_class_weight@entry=0x7fffbb041f30, 
    __pyx_v_sample_weight=0x7fffbb041e10, __pyx_v_shrinking=1, __pyx_v_probability=0, __pyx_v_cache_size=200, __pyx_v_max_iter=-1, __pyx_v_random_seed=209652396, __pyx_self=<optimized out>)
[...]

SciPy/NumPy/Python version information

1.7.1 1.21.2 3.8.2
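
The version numbers above do not show which OpenBLAS build was loaded or which core type it detected. One way to inspect that at runtime is the threadpoolctl package, which is not mentioned in the issue and is assumed here to be installed separately. The helper name openblas_runtime_info is hypothetical, and the sketch returns an empty list when threadpoolctl is unavailable.

```python
def openblas_runtime_info():
    """Return threadpoolctl entries for every OpenBLAS library loaded in this process."""
    try:
        # Importing scipy.linalg loads the OpenBLAS bundled with the SciPy wheel, if any.
        import scipy.linalg  # noqa: F401
    except ImportError:
        pass
    try:
        from threadpoolctl import threadpool_info
    except ImportError:
        # threadpoolctl is an optional third-party package.
        return []
    return [
        lib for lib in threadpool_info()
        if lib.get("internal_api") == "openblas"
    ]


for lib in openblas_runtime_info():
    # "architecture" is the core type OpenBLAS selected, e.g. Prescott or Haswell.
    print(lib.get("filepath"), lib.get("version"), lib.get("architecture"))
```

On an affected setup, the architecture field would be expected to read Prescott for the OpenBLAS 0.3.9 library bundled with scipy.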

Issue Analytics

  • State: closed
  • Created 2 years ago
  • Reactions: 3
  • Comments: 7 (7 by maintainers)

Top GitHub Comments

2 reactions
ogrisel commented, Nov 18, 2021

The architecture detection bug was already fixed upstream.

The fact that the Prescott kernel can segfault is probably related to the scikit-learn test using a non-memory-aligned buffer. The fact that the buffer is read-only is probably unrelated.

Given that Prescott is a really old architecture and that very few people are likely to call OpenBLAS routines on non-memory-aligned data, I think it’s not worth spending time trying to fix this issue. Still, let me report it, just in case the OpenBLAS devs are interested. Anyway, I think we can close the scipy issue.

2 reactions
ogrisel commented, Oct 21, 2021

Note that https://github.com/scipy/scipy/blob/master/tools/openblas_support.py has already been updated to use OpenBLAS 0.3.17 and the nightly wheels already ship it:

https://anaconda.org/scipy-wheels-nightly/scipy/files
