Segmentation fault (core dumped) for shap_values
Hi,
I'm trying to apply `TreeExplainer` to compute `shap_values` for an XGBoost regression model on a large dataset. During hyperparameter tuning, it fails with a segmentation fault at the `explainer.shap_values()` step for certain hyperparameter sets. I used `fasttreeshap==0.1.1` and `xgboost==1.4.1` (also tested 1.6.0), on a machine with an Intel Xeon E5-2640 v4 (20 cores) @ 3.40GHz CPU and 128GB of memory. The sample code below is a toy script that reproduces the issue using the Superconductor dataset from the example notebook:
```python
# for debugging
import faulthandler
faulthandler.enable()

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
import xgboost as xgb
import fasttreeshap

print(f"XGBoost version: {xgb.__version__}")
print(f"fasttreeshap version: {fasttreeshap.__version__}")

# source of data: https://archive.ics.uci.edu/ml/datasets/superconductivty+data
data = pd.read_csv("FastTreeSHAP/data/superconductor_train.csv", engine = "python")
train, test = train_test_split(data, test_size = 0.5, random_state = 0)
label_train = train["critical_temp"]
label_test = test["critical_temp"]
train = train.iloc[:, :-1]
test = test.iloc[:, :-1]

print("train XGBoost model")
xgb_model = xgb.XGBRegressor(
    max_depth = 100, n_estimators = 200, learning_rate = 0.1, n_jobs = -1, alpha = 0.12, random_state = 0)
xgb_model.fit(train, label_train)

print("run TreeExplainer()")
shap_explainer = fasttreeshap.TreeExplainer(xgb_model)

print("run shap_values()")
shap_values = shap_explainer.shap_values(train)
```
The `/usr/bin/time` report of the program execution also shows that the "Maximum resident set size" was only about 32GB:
```
~$ /usr/bin/time -v python segfault.py
XGBoost version: 1.4.1
fasttreeshap version: 0.1.1
train XGBoost model
run TreeExplainer()
run shap_values()
Fatal Python error: Segmentation fault

Thread 0x00007ff2c2793740 (most recent call first):
  File "~/.local/lib/python3.8/site-packages/fasttreeshap/explainers/_tree.py", line 459 in shap_values
  File "segfault.py", line 27 in <module>
Segmentation fault (core dumped)
Command terminated by signal 11
    Command being timed: "python segfault.py"
    User time (seconds): 333.65
    System time (seconds): 27.79
    Percent of CPU this job got: 797%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:45.30
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 33753096
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 8188488
    Voluntary context switches: 3048
    Involuntary context switches: 3089
    Swaps: 0
    File system inputs: 0
    File system outputs: 0
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
```
In some cases (including the example above), forcing `TreeExplainer(algorithm="v1")` did help, which suggests the issue only occurs with `"v2"` (or with `"auto"` when it passes the `_memory_check()`). However, v1 occasionally raises a separate `check_additivity` error that remains unsolved in the original algorithm. Alternatively, passing `approximate=True` to `explainer.shap_values()` works, but the approximation raises consistency concerns for the reproducibility of our studies; both workarounds are sketched below.
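For reference, here is a minimal sketch of the two workarounds mentioned above, reusing `xgb_model` and `train` from the reproduction script (the `algorithm=` and `approximate=` options are the ones referred to in this issue):

```python
# Workaround 1: force the v1 algorithm instead of letting "auto" pick v2.
shap_explainer_v1 = fasttreeshap.TreeExplainer(xgb_model, algorithm = "v1")
shap_values_v1 = shap_explainer_v1.shap_values(train)

# Workaround 2: approximate attributions (fast, but not exact SHAP values,
# hence the reproducibility/consistency concern mentioned above).
shap_explainer = fasttreeshap.TreeExplainer(xgb_model)
shap_values_approx = shap_explainer.shap_values(train, approximate = True)
```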
Could you help me debug this issue? Thank you so much!
Top GitHub Comments
Hi Shaun,
Thanks for pointing out this issue. I have checked the code, and it seems that the function `_memory_check()` doesn't produce the correct result when the maximum tree depth is very large, due to numerical errors. This leads to an out-of-memory issue when setting `algorithm="v2"` or `"auto"` (`_memory_check()` should detect the out-of-memory risk for algorithm v2 and automatically switch to algorithm v1). I have fixed this issue in the latest commit https://github.com/linkedin/FastTreeSHAP/commit/fa8531502553ad5d3e3dfb9dce97a86acad41b1c. Let me know if you still have the out-of-memory issue. Thanks!
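For intuition only, here is a purely hypothetical sketch (not the actual `_memory_check()` implementation, and `n_leaves` is a made-up value) of how a memory estimate that grows exponentially with tree depth can go numerically wrong at extreme depths such as `max_depth = 100` when it is computed with fixed-width 64-bit integers:

```python
import numpy as np

max_depth = 100     # as in the reproduction script
n_leaves = 1000     # hypothetical leaf count for a single tree

# Exact Python integers: an exponential-in-depth footprint estimate is
# astronomically large, so a correct check would fall back to algorithm v1.
exact_bytes = n_leaves * 2**max_depth * 8
print(f"exact estimate:      {exact_bytes:.3e} bytes")

# The same estimate with fixed-width 64-bit integers wraps around to 0,
# so such a check would wrongly conclude that v2 fits in memory.
wrapped = np.int64(n_leaves) * np.int64(2) ** np.int64(max_depth) * np.int64(8)
print(f"64-bit int estimate: {wrapped} bytes")
```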
Thanks Shaun so much for the detailed description of your experiment settings, and for the table with very detailed quantitative results! Really happy to see that `fasttreeshap` has helped mitigate the numerical precision issues in your project. Let me know if there is anything else I can help with, and good luck with your project! 😃