feature_importance causes a BSOD on Windows 10

Describe the bug

Running permutation_importance on a medium-sized dataset results in a BSOD on Windows 10. The dataset is 470605 x 332, the code runs in a Jupyter notebook, and the versions are Python 3.7.6 and scikit-learn 0.22.1. The BSOD is a KERNEL_SECURITY_CHECK_FAILURE with ERROR_CODE: (NTSTATUS) 0xc0000409 - The system detected an overrun of a stack-based buffer in this application. This overrun could potentially allow a malicious user to gain control of this application. The machine has a Ryzen 5 3600 with 16GB of RAM.

Steps/Code to Reproduce

from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# X_train, y_train, X_val and y_val are the reporter's own data (not shown in the issue).
rf = RandomForestClassifier(n_estimators=250,
                            n_jobs=-1,
                            oob_score=True,
                            bootstrap=True,
                            random_state=42)
rf.fit(X_train, y_train)

# Permute each feature 5 times and measure the drop in F1 on the validation set.
permImp = permutation_importance(rf,
                                 X_val,
                                 y_val,
                                 scoring='f1',
                                 n_repeats=5,
                                 n_jobs=-1,
                                 random_state=42)
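
The snippet above depends on the reporter's own data, which is not part of the issue. For anyone who wants to run the same calls end to end, here is a minimal sketch that substitutes synthetic data from make_classification. The function name run_repro, the default sizes, and the train/validation split are assumptions made for illustration; they are far smaller than the 470605 x 332 dataset in the report, and nothing here is claimed to trigger the BSOD.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split


def run_repro(n_samples=2000, n_features=20, n_estimators=25, n_jobs=-1):
    """Run the issue's calls on synthetic data (hypothetical helper, not from the issue)."""
    # Synthetic binary classification data stands in for the reporter's dataset.
    X, y = make_classification(
        n_samples=n_samples, n_features=n_features, random_state=42
    )
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.25, random_state=42
    )

    # Same estimator settings as the report, except for a smaller forest.
    rf = RandomForestClassifier(
        n_estimators=n_estimators,
        n_jobs=n_jobs,
        oob_score=True,
        bootstrap=True,
        random_state=42,
    )
    rf.fit(X_train, y_train)

    # Same permutation_importance arguments as the report.
    return permutation_importance(
        rf,
        X_val,
        y_val,
        scoring="f1",
        n_repeats=5,
        n_jobs=n_jobs,
        random_state=42,
    )
```

Calling run_repro(n_jobs=1) and then run_repro(n_jobs=-1) gives a rough way to compare the sequential and parallel code paths; the sizes can be raised gradually toward the shape given in the report.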

Expected Results

No BSOD, permutation importance computed.

Actual Results

BSOD after ~1-2 minutes

Versions

sklearn.show_versions()

System:
    python: 3.7.6 (default, Jan 8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)]
    executable: C:\Users\lucag\anaconda3\python.exe
    machine: Windows-10-10.0.18362-SP0

Python dependencies:
    pip: 20.0.2
    setuptools: 45.2.0.post20200210
    sklearn: 0.22.1
    numpy: 1.18.1
    scipy: 1.4.1
    Cython: 0.29.15
    pandas: 1.0.1
    matplotlib: 3.1.3
    joblib: 0.14.1

Built with OpenMP: True

Issue Analytics

  • State: open
  • Created 3 years ago
  • Comments: 22 (9 by maintainers)

Top GitHub Comments

ogrisel commented, Sep 29, 2020

Actually, looking at the code, using memory mapping should be fine (and useful), because joblib.Parallel is called on the non-permuted data and the permutation is applied to a copy of the data in the worker.

Maybe something is going wrong at a low level that only happens on Windows. I need to investigate with a Windows VM.
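
The pattern described in this comment can be sketched roughly as follows. This is not scikit-learn's actual implementation: the names permuted_scores and _score_with_permuted_column and the score_fn callback are hypothetical, chosen only to show the shape of the idea. The parent process hands the original array to joblib.Parallel (which, with its process-based backend and default settings, may pass large arrays to workers as read-only memory maps), and each worker permutes a column on its own private copy.

```python
import numpy as np
from joblib import Parallel, delayed


def _score_with_permuted_column(X, y, col_idx, n_repeats, seed, score_fn):
    """Score the data n_repeats times with one column shuffled (hypothetical helper)."""
    rng = np.random.RandomState(seed)
    # Private, writable copy made inside the worker; the shared input
    # (possibly a read-only memmap) is only ever read, never modified.
    X_permuted = np.array(X)
    scores = np.empty(n_repeats)
    for r in range(n_repeats):
        shuffled = rng.permutation(X.shape[0])
        X_permuted[:, col_idx] = X[shuffled, col_idx]
        scores[r] = score_fn(X_permuted, y)
    return scores


def permuted_scores(X, y, score_fn, n_repeats=5, n_jobs=1, random_state=0):
    """Return an (n_features, n_repeats) array of scores, one row per permuted column."""
    seeds = np.random.RandomState(random_state).randint(
        np.iinfo(np.int32).max, size=X.shape[1]
    )
    # The non-permuted X is what gets dispatched to the workers.
    results = Parallel(n_jobs=n_jobs)(
        delayed(_score_with_permuted_column)(
            X, y, col, n_repeats, int(seed), score_fn
        )
        for col, seed in enumerate(seeds)
    )
    return np.vstack(results)
```

Because the workers only read from the shared array, a read-only memory map is sufficient as input, which is the reason the comment gives for considering memory mapping safe here. For n_jobs greater than 1, score_fn needs to be picklable, for example a module-level function.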

Devilmoon commented, Aug 19, 2020

Unfortunately, I already removed those folders and cannot remember the exact file names. However, I can tell you that they all had short file names and were all the same size, about 2MB.
