feature_importance causes a BSOD on Windows 10
Describe the bug
Running permutation_importance on a medium-sized dataset results in a BSOD on Windows 10. The dataset is 470605 x 332; the code runs in a Jupyter notebook with Python 3.7.6 and scikit-learn 0.22.1.
The BSOD is a KERNEL_SECURITY_CHECK_FAILURE, with ERROR_CODE: (NTSTATUS) 0xc0000409 - The system detected an overrun of a stack-based buffer in this application. This overrun could potentially allow a malicious user to gain control of this application.
The machine has a Ryzen 5 3600 with 16GB of RAM.
Steps/Code to Reproduce
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# X_train, y_train, X_val, y_val are the reporter's train/validation split
# of the 470605 x 332 dataset (not included in the report).
rf = RandomForestClassifier(n_estimators=250,
                            n_jobs=-1,
                            oob_score=True,
                            bootstrap=True,
                            random_state=42)
rf.fit(X_train, y_train)

permImp = permutation_importance(rf,
                                 X_val,
                                 y_val,
                                 scoring='f1',
                                 n_repeats=5,
                                 n_jobs=-1,
                                 random_state=42)
Expected Results
No BSOD, permutation importance computed.
Actual Results
BSOD after ~1-2 minutes
Versions
sklearn.show_versions()
System:
    python: 3.7.6 (default, Jan 8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)]
executable: C:\Users\lucag\anaconda3\python.exe
   machine: Windows-10-10.0.18362-SP0

Python dependencies:
       pip: 20.0.2
setuptools: 45.2.0.post20200210
   sklearn: 0.22.1
     numpy: 1.18.1
     scipy: 1.4.1
    Cython: 0.29.15
    pandas: 1.0.1
matplotlib: 3.1.3
    joblib: 0.14.1
Built with OpenMP: True
Actually, looking at the code, using memory mapping should be fine (and useful) here, because joblib.Parallel is called on the non-permuted data and the permutation is applied to a copy of the data inside the worker.
Maybe something low-level is going wrong that only happens on Windows. I need to investigate with a Windows VM.
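A simplified sketch of the pattern described in the comment above (an illustration, not the actual scikit-learn source): joblib.Parallel is handed the unpermuted validation data, and each worker permutes a single column on its own local copy before scoring, so the shared, possibly memory-mapped array is only ever read.

import numpy as np
from joblib import Parallel, delayed

def _score_one_column(estimator, X, y, col_idx, scorer, n_repeats, seed):
    # The copy is made inside the worker; only this local copy is modified,
    # never the (possibly memory-mapped) parent array.
    rng = np.random.RandomState(seed)
    X_permuted = X.copy()
    scores = np.empty(n_repeats)
    for r in range(n_repeats):
        rng.shuffle(X_permuted[:, col_idx])  # shuffle only this column
        scores[r] = scorer(estimator, X_permuted, y)
    return scores

def permutation_scores(estimator, X, y, scorer, n_repeats=5, n_jobs=-1, seed=0):
    # joblib may memory-map X for the workers when it is large enough;
    # the permutation itself happens on each worker's copy.
    return Parallel(n_jobs=n_jobs)(
        delayed(_score_one_column)(estimator, X, y, j, scorer, n_repeats, seed)
        for j in range(X.shape[1]))

This could be exercised with, for example, scorer = sklearn.metrics.get_scorer('f1') and the rf, X_val, y_val objects from the reproduction snippet above.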
Unfortunately, I have already removed those folders and cannot remember the exact file names. I can tell you that they all had short file names and were all the same size, around 2 MB.
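Not confirmed anywhere in this thread, but if those files were joblib's memory-mapped temporaries, one way to test that hypothesis would be to rerun the computation single-process, which avoids the memmapped temporary files entirely. A sketch, reusing rf, X_val and y_val from the reproduction snippet above:

# Diagnostic sketch (assumption: the crash is tied to joblib's
# process-based parallelism / memory mapping; not confirmed here).
# n_jobs=1 keeps everything in the main process, so no memmapped
# temporary files are created.
from sklearn.inspection import permutation_importance

permImp_single = permutation_importance(rf,
                                        X_val,
                                        y_val,
                                        scoring='f1',
                                        n_repeats=5,
                                        n_jobs=1,
                                        random_state=42)

If the single-process run completes without a BSOD, that would point at the parallel/memmapping path rather than the permutation-importance computation itself.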