Segmentation fault in libiomp with pytorch / regression from 0.24.2
Describe the bug
When running a simple program to fit some values, Python crashes inside libiomp if and only if torch is imported before scikit-learn and version 1.0 of scikit-learn is used.
Steps/Code to Reproduce
from random import randint
# Comment the next line to avoid the segfault
import torch
from sklearn.cluster import KMeans
X = [[randint(0, j) for j in range(1000)] for i in range(1000)]
kmeans = KMeans(n_clusters=4)
kmeans.fit(X)
print(kmeans.labels_)
Expected Results
The sample code shouldn’t crash when torch is imported before scikit-learn, and it should print the labels as shown below (import torch has been removed to get this result):
(.venv) (base) ➜ test_seg_fault $ python main.py
[0 2 0 1 3 1 3 0 2 0 0 0 1 1 2 0 1 1 1 0 2 1 2 0 3 0 3 3 1 3 0 1 0 1 3 1 1
2 0 3 1 0 3 2 0 0 2 0 3 3 0 2 1 0 2 1 0 3 0 3 1 0 1 1 0 0 0 1 3 1 1 0 2 0
3 3 1 2 0 2 3 3 2 0 2 1 3 2 0 2 1 3 1 1 3 0 3 3 0 0 0 2 0 1 3 3 0 0 2 0 1
0 3 2 3 1 3 0 3 1 2 3 1 3 0 3 3 0 1 3 1 2 1 3 3 1 1 2 3 2 1 0 3 3 3 0 1 1
1 3 0 3 2 0 2 3 2 2 2 3 3 0 0 0 0 2 3 2 3 0 0 3 2 2 1 0 0 3 0 2 1 3 3 0 3
0 1 0 1 3 3 1 1 0 1 2 3 1 1 3 0 0 0 3 3 0 2 0 1 3 1 0 2 3 0 1 3 3 2 1 1 0
0 1 3 0 0 1 2 1 2 3 2 1 1 3 2 1 0 0 0 0 3 0 0 0 0 2 3 2 3 2 2 0 3 0 2 0 2
1 0 1 3 0 2 3 0 1 1 0 0 2 2 1 3 0 0 1 3 3 0 0 2 3 1 0 0 1 1 2 1 0 3 1 0 1
1 0 1 2 0 2 1 3 0 3 0 0 1 1 0 0 0 3 3 1 3 0 3 1 3 3 0 3 1 1 1 3 1 0 2 0 1
2 0 0 2 0 1 1 1 3 1 3 2 1 1 3 3 2 3 2 3 0 1 3 3 1 0 3 1 1 1 3 3 2 3 2 1 2
0 3 3 3 3 0 1 1 3 3 3 0 2 2 3 2 2 0 3 0 3 2 3 3 0 0 3 0 0 2 1 1 0 2 0 2 0
2 2 0 1 3 0 0 2 0 0 1 3 3 0 0 0 1 3 2 3 0 2 2 0 1 0 3 3 2 0 2 3 0 1 1 0 3
0 2 0 3 1 3 2 3 1 0 2 2 1 0 3 1 2 3 0 2 2 2 1 3 3 3 3 0 0 3 0 3 2 2 3 0 3
1 3 2 3 1 2 3 1 3 3 0 2 0 0 3 2 2 2 1 1 2 0 1 3 3 2 1 1 3 0 0 1 2 0 1 2 0
0 3 1 0 0 1 1 1 3 2 1 0 1 2 3 0 0 3 1 0 0 2 3 3 1 2 3 1 2 3 2 3 3 3 3 2 2
1 0 2 1 2 1 1 1 1 3 3 2 3 1 3 0 0 1 3 0 1 2 0 1 0 2 2 3 3 0 0 0 2 3 1 2 2
1 2 1 3 0 0 0 3 1 3 0 2 0 2 3 3 1 2 1 3 0 0 2 3 1 1 0 1 0 0 3 1 1 1 0 2 1
1 3 1 0 0 0 0 1 0 3 0 0 0 3 2 1 3 0 3 3 1 3 2 0 0 3 3 2 0 0 2 0 1 3 0 0 0
1 3 3 2 2 1 3 3 2 3 2 0 1 0 0 3 2 0 1 0 1 3 2 1 0 3 3 3 1 2 1 2 1 3 0 0 0
3 3 0 1 2 3 0 3 2 2 1 3 0 3 3 1 1 1 0 0 1 0 0 0 0 3 3 1 0 3 3 3 2 0 0 2 2
0 1 0 3 2 1 3 0 1 1 2 1 0 1 3 1 0 0 1 3 1 1 3 1 0 3 2 1 2 1 0 2 2 2 2 2 0
1 0 3 0 0 0 3 0 3 1 1 0 0 0 2 3 1 1 0 2 3 3 3 3 2 1 0 2 0 3 3 3 1 0 0 1 0
0 1 2 2 3 2 2 3 1 1 1 1 3 1 3 1 0 1 3 2 0 0 0 1 2 3 1 2 0 3 1 3 2 0 3 3 0
0 2 1 2 0 1 1 0 3 3 3 2 3 1 3 0 0 1 1 2 3 1 1 1 3 2 3 3 0 1 0 0 2 1 1 1 1
3 3 3 3 1 3 1 3 1 3 1 0 3 1 0 1 1 1 1 1 2 2 0 3 3 0 0 3 1 0 1 3 0 3 2 3 3
1 2 1 3 0 0 2 3 1 1 2 1 0 3 0 3 1 3 3 2 2 2 1 2 0 1 3 1 3 0 3 1 3 3 1 1 0
1 3 0 2 1 2 3 2 1 0 1 3 2 3 1 2 0 3 1 3 1 3 2 3 0 2 0 3 3 2 3 1 1 0 3 3 0
1]
Actual Results
The program fails with a segmentation fault. I managed to extract the stack trace with lldb (on macOS):
(.venv) (base) ➜ test_seg_fault $ lldb ./.venv/bin/python
(lldb) target create "./.venv/bin/python"
Current executable set to './.venv/bin/python' (x86_64).
(lldb) run main.py
Process 70127 launched: './.venv/bin/python' (x86_64)
Process 70127 stopped
* thread #2, stop reason = exec
frame #0: 0x000000010000e000 dyld`_dyld_start
dyld`_dyld_start:
-> 0x10000e000 <+0>: popq %rdi
0x10000e001 <+1>: pushq $0x0
0x10000e003 <+3>: movq %rsp, %rbp
0x10000e006 <+6>: andq $-0x10, %rsp
Target 0: (Python) stopped.
(lldb) continue
Process 70127 resuming
Process 70127 stopped
* thread #17, stop reason = EXC_BAD_ACCESS (code=1, address=0x48)
frame #0: 0x000000013012aa6c libiomp5.dylib`void __kmp_suspend_64<false, true>(int, kmp_flag_64<false, true>*) + 28
libiomp5.dylib`__kmp_suspend_64<false, true>:
-> 0x13012aa6c <+28>: movq (%rdx,%rdi,8), %r13
0x13012aa70 <+32>: movq %r13, %rdi
0x13012aa73 <+35>: callq 0x13011d570 ; __kmp_suspend_initialize_thread
0x13012aa78 <+40>: leaq 0x5c0(%r13), %r14
thread #18, stop reason = EXC_BAD_ACCESS (code=1, address=0x50)
frame #0: 0x000000013012aa6c libiomp5.dylib`void __kmp_suspend_64<false, true>(int, kmp_flag_64<false, true>*) + 28
libiomp5.dylib`__kmp_suspend_64<false, true>:
-> 0x13012aa6c <+28>: movq (%rdx,%rdi,8), %r13
0x13012aa70 <+32>: movq %r13, %rdi
0x13012aa73 <+35>: callq 0x13011d570 ; __kmp_suspend_initialize_thread
0x13012aa78 <+40>: leaq 0x5c0(%r13), %r14
thread #19, stop reason = EXC_BAD_ACCESS (code=1, address=0x58)
frame #0: 0x000000013012aa6c libiomp5.dylib`void __kmp_suspend_64<false, true>(int, kmp_flag_64<false, true>*) + 28
libiomp5.dylib`__kmp_suspend_64<false, true>:
-> 0x13012aa6c <+28>: movq (%rdx,%rdi,8), %r13
0x13012aa70 <+32>: movq %r13, %rdi
0x13012aa73 <+35>: callq 0x13011d570 ; __kmp_suspend_initialize_thread
0x13012aa78 <+40>: leaq 0x5c0(%r13), %r14
thread #20, stop reason = EXC_BAD_ACCESS (code=1, address=0x60)
frame #0: 0x000000013012aa6c libiomp5.dylib`void __kmp_suspend_64<false, true>(int, kmp_flag_64<false, true>*) + 28
libiomp5.dylib`__kmp_suspend_64<false, true>:
-> 0x13012aa6c <+28>: movq (%rdx,%rdi,8), %r13
0x13012aa70 <+32>: movq %r13, %rdi
0x13012aa73 <+35>: callq 0x13011d570 ; __kmp_suspend_initialize_thread
0x13012aa78 <+40>: leaq 0x5c0(%r13), %r14
Target 0: (Python) stopped.
(lldb)
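One detail stands out in the trace above: the four faulting addresses (0x48, 0x50, 0x58, 0x60) are exactly 8 bytes apart, which matches the scaled-index load `movq (%rdx,%rdi,8), %r13`. The sketch below works through that arithmetic. It assumes that the base register %rdx was zero, which the trace does not confirm because no register values are shown; `implied_index` is a hypothetical helper name used only for illustration.

```python
# Fault addresses reported by lldb for threads #17 to #20.
fault_addresses = [0x48, 0x50, 0x58, 0x60]

# The faulting instruction is `movq (%rdx,%rdi,8), %r13`,
# which loads 8 bytes from base + index * 8.
ELEMENT_SIZE = 8

# ASSUMPTION: the base register (%rdx) held 0. The trace shows no registers,
# so this is a guess, not a finding from the report.
ASSUMED_BASE = 0


def implied_index(address, base=ASSUMED_BASE, scale=ELEMENT_SIZE):
    """Return the index that produces `address` for the given base and scale."""
    offset = address - base
    if offset % scale != 0:
        raise ValueError(f"{address:#x} is not a multiple of {scale} from the base")
    return offset // scale


indices = [implied_index(a) for a in fault_addresses]
strides = [b - a for a, b in zip(fault_addresses, fault_addresses[1:])]

print(indices)  # [9, 10, 11, 12]
print(strides)  # [8, 8, 8]
```

If the assumption holds, the consecutive indices would be consistent with each worker thread reading its own slot of a pointer table that was null at that moment. That reading is a guess from the addresses alone, not something the report establishes.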
Versions
>>> import sklearn; sklearn.show_versions()
System:
python: 3.9.7 (default, Sep 3 2021, 12:37:55) [Clang 12.0.5 (clang-1205.0.22.9)]
executable: /Users/abderraouf.elgasser/projects/iktos/experiments/test_seg_fault_ndevaux/.venv/bin/python
machine: macOS-11.4-x86_64-i386-64bit
Python dependencies:
pip: 21.2.4
setuptools: 57.4.0
sklearn: 1.0
numpy: 1.21.1
scipy: 1.6.1
Cython: None
pandas: None
matplotlib: None
joblib: 1.1.0
threadpoolctl: 3.0.0
Built with OpenMP: True
Tested with Python 3.7.9, 3.7.12 and 3.9.7, and with torch 1.7.0 and 1.9.1.
It doesn’t crash in any of these environments with version 0.24.2 of scikit-learn.
Tested on macOS 11.4.
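Because the crash is inside libiomp5 and threadpoolctl 3.0.0 is installed in this environment, one way to see which OpenMP runtimes are loaded in the process is to filter the output of `threadpoolctl.threadpool_info()`. The sketch below is an illustration and not part of the original report. The helper names `openmp_runtimes` and `report` are hypothetical, and the code assumes the `user_api`, `prefix` and `filepath` keys that threadpoolctl returns for each loaded library.

```python
def openmp_runtimes(modules):
    """Keep only the entries that describe an OpenMP runtime.

    `modules` is a list of dicts shaped like the output of
    threadpoolctl.threadpool_info().
    """
    return [m for m in modules if m.get("user_api") == "openmp"]


def report(modules):
    """Return one human-readable line per loaded OpenMP runtime."""
    return [f"{m.get('prefix')}: {m.get('filepath')}" for m in openmp_runtimes(modules)]


if __name__ == "__main__":
    try:
        from threadpoolctl import threadpool_info
    except ImportError:
        print("threadpoolctl is not installed")
    else:
        # Import the libraries under test first (for example torch, then
        # sklearn) so that their native runtimes are already loaded.
        for line in report(threadpool_info()):
            print(line)
```

Two entries with different prefixes (for example `libiomp` and `libomp`) would mean that two OpenMP runtimes are loaded side by side, which would fit the import-order sensitivity described in this report. Whether that is the actual cause here is not established by the output shown.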
Issue Analytics
- State:
- Created 2 years ago
- Comments:6 (3 by maintainers)
Top GitHub Comments
The fix will be included in 1.0.1. I do not have an exact timeline for that release, but I suspect it will be soon.
This may be related to https://github.com/scikit-learn/scikit-learn/issues/21182, where we used libomp 12 to build the osx wheel.
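Going only by what this thread states (no crash with 0.24.2, a crash with 1.0, and a fix planned for 1.0.1), a guard like the following could flag the affected release in a test environment. The names `parse_version` and `is_affected_sklearn` are hypothetical, and the sketch assumes that 1.0 is the only affected release, which the thread does not explicitly confirm.

```python
import re


def parse_version(version):
    """Parse the leading 'X.Y[.Z]' of a version string into a 3-tuple of ints.

    Any suffix after the numeric part (for example 'rc1') is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_affected_sklearn(version):
    """Return True for the release reported as crashing in this thread."""
    # ASSUMPTION: only 1.0 (that is, 1.0.0) is affected; 0.24.2 is reported
    # as working and the fix is announced for 1.0.1.
    return parse_version(version) == (1, 0, 0)


print(is_affected_sklearn("1.0"))     # True
print(is_affected_sklearn("0.24.2"))  # False
print(is_affected_sklearn("1.0.1"))   # False
```

In practice the argument would come from `sklearn.__version__`. The check says nothing about platform: the report only covers macOS 11.4 on x86_64.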
Can you see if you get this error by installing the nightly build: