TreeSHAP does not compute exact SHAP values on correlated features
As highlighted in https://github.com/slundberg/shap/issues/2345, it seems that TreeSHAP does not compute the expected SHAP values.
The following notebook reproduces the problem with the FastTreeSHAP
implementation:
https://nbviewer.org/gist/glemaitre/9a30dd3a704675164b84d9bf7128882e
Note that this is not only a problem in the implementation: it stems from Algorithm 1 of the original TreeSHAP paper (i.e. the tree traversal).
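To make the concern concrete, here is a minimal pure-Python sketch of how a tree-traversal estimate of the value function can differ from the observational conditional expectation when features are correlated. Everything in it is an assumption made for illustration and is not taken from the linked notebook: the toy dataset `DATA`, the hand-built `TREE`, and the helper names `path_dependent_value`, `conditional_value`, and `shapley` are all hypothetical. The traversal follows the general idea of weighting both children by their training cover when the split feature is not in the coalition; it is not the SHAP package's code.

```python
from itertools import combinations
from math import factorial

# Hypothetical toy data: two correlated binary features, (x0, x1) -> sample count.
DATA = [((0, 0), 4), ((0, 1), 1), ((1, 0), 1), ((1, 1), 4)]

# Hypothetical tree. Internal node: (feature, left, right, cover_left, cover_right),
# where "left" is taken when the feature is 0. A leaf is a plain float.
TREE = (0, (1, 0.0, 1.0, 4, 1), (1, 2.0, 3.0, 1, 4), 5, 5)


def predict(node, x):
    """Follow the decision path of x down to a leaf."""
    while not isinstance(node, float):
        feat, left, right, _, _ = node
        node = right if x[feat] == 1 else left
    return node


def path_dependent_value(node, x, S):
    """Traversal estimate: follow x for features in S, else average children by cover."""
    if isinstance(node, float):
        return node
    feat, left, right, cl, cr = node
    if feat in S:
        return path_dependent_value(right if x[feat] == 1 else left, x, S)
    left_val = path_dependent_value(left, x, S)
    right_val = path_dependent_value(right, x, S)
    return (cl * left_val + cr * right_val) / (cl + cr)


def conditional_value(x, S):
    """Observational E[f(X) | X_S = x_S], computed directly from the toy data."""
    rows = [(xi, n) for xi, n in DATA if all(xi[j] == x[j] for j in S)]
    total = sum(n for _, n in rows)
    return sum(n * predict(TREE, xi) for xi, n in rows) / total


def shapley(value, n_features):
    """Brute-force Shapley values for a value function defined on feature subsets."""
    phi = []
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        contrib = 0.0
        for k in range(len(others) + 1):
            for subset in combinations(others, k):
                weight = (factorial(k) * factorial(n_features - k - 1)
                          / factorial(n_features))
                S = set(subset)
                contrib += weight * (value(S | {i}) - value(S))
        phi.append(contrib)
    return phi


x = (1, 1)
phi_path = shapley(lambda S: path_dependent_value(TREE, x, S), 2)
phi_cond = shapley(lambda S: conditional_value(x, S), 2)
print("traversal-based:", [round(p, 2) for p in phi_path])
print("conditional:    ", [round(p, 2) for p in phi_cond])
```

In this toy setup both attributions sum to the same total (the prediction minus the mean prediction), but the split between the two features differs: roughly 1.15 and 0.35 with the traversal estimate, against roughly 0.85 and 0.65 with the conditional expectation. The gap comes from the coalition containing only `x1`: the traversal averages over the root split on `x0` by cover, which ignores what `x1 = 1` says about `x0`.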
Issue Analytics
- Created a year ago
- Comments: 6
Top GitHub Comments
Thanks @glemaitre for pointing it out. The primary goal of the FastTreeSHAP package is to provide a fast implementation of the TreeSHAP algorithm (feature_perturbation="tree_path_dependent") that reproduces the results of the original implementation of the TreeSHAP algorithm (feature_perturbation="tree_path_dependent") in the SHAP package. For users who currently run the TreeSHAP algorithm (feature_perturbation="tree_path_dependent") through the SHAP package, we aim to provide a more efficient way of doing so.

The potential issue you pointed out concerns the mathematical details of the TreeSHAP algorithm in the original paper, so we think it is more appropriate to discuss it with the original authors of TreeSHAP. If the original authors update the TreeSHAP algorithm (feature_perturbation="tree_path_dependent") in the SHAP package, we will update our implementation on our side to make sure it produces exactly the same results.
Yes that’s correct.