TreeSHAP does not compute exact SHAP values on correlated features


As highlighted in https://github.com/slundberg/shap/issues/2345, it seems that TreeSHAP does not compute the expected SHAP values.

The following notebook reproduces the problem with the FastTreeSHAP implementation: https://nbviewer.org/gist/glemaitre/9a30dd3a704675164b84d9bf7128882e

Note that this is not just a problem in the implementation: it is a problem in Algorithm 1 of the original TreeSHAP paper (i.e., the tree traversal).
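To make the claim concrete, here is a minimal sketch that is not taken from the linked notebook and does not use the SHAP or FastTreeSHAP code. It assumes a hand-built two-feature tree and a made-up training set with strongly correlated binary features (`TREE`, `DATA`, and all helper names are hypothetical). It enumerates coalitions by brute force and compares two value functions: a path-dependent one, which averages over both children by training cover whenever a split feature is unknown (in the spirit of the tree traversal mentioned above), and the empirical conditional expectation E[f(x) | x_S] computed directly from the data.

```python
from itertools import combinations
from math import factorial

# Made-up training set: two binary features that usually agree (correlated).
DATA = [(0, 0)] * 4 + [(0, 1)] * 1 + [(1, 0)] * 1 + [(1, 1)] * 4

def leaf(value):
    return {"value": value}

def split(feature, left, right):
    # Convention for this toy tree: go left when the feature equals 0.
    return {"feature": feature, "left": left, "right": right}

# Hand-built tree: x0 at the root, x1 in both subtrees.
TREE = split(0, split(1, leaf(0.0), leaf(1.0)), split(1, leaf(2.0), leaf(4.0)))

def predict(node, x):
    while "value" not in node:
        node = node["left"] if x[node["feature"]] == 0 else node["right"]
    return node["value"]

def path_dependent_value(node, x, known, rows):
    """Value of coalition `known`: follow x on known features, otherwise
    average both children weighted by the training rows reaching them."""
    if "value" in node:
        return node["value"]
    f = node["feature"]
    left_rows = [r for r in rows if r[f] == 0]
    right_rows = [r for r in rows if r[f] != 0]
    if f in known:
        if x[f] == 0:
            return path_dependent_value(node["left"], x, known, left_rows)
        return path_dependent_value(node["right"], x, known, right_rows)
    left = path_dependent_value(node["left"], x, known, left_rows)
    right = path_dependent_value(node["right"], x, known, right_rows)
    return (len(left_rows) * left + len(right_rows) * right) / len(rows)

def conditional_value(x, known):
    """Empirical E[f(X) | X_known = x_known] over the training rows."""
    rows = [r for r in DATA if all(r[j] == x[j] for j in known)]
    return sum(predict(TREE, r) for r in rows) / len(rows)

def shapley(value_fn, n):
    """Exact Shapley values by enumerating every coalition (exponential in n)."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(len(others) + 1):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

x = (1, 1)
phi_pd = shapley(lambda s: path_dependent_value(TREE, x, s, DATA), 2)
phi_cond = shapley(lambda s: conditional_value(x, s), 2)
print("path-dependent:", phi_pd)
print("conditional:   ", phi_cond)
```

On this toy example both attributions sum to the same total (prediction minus the mean prediction, 4.0 - 1.9 = 2.1), but they split it differently: roughly (1.6, 0.5) for the path-dependent value function versus (1.15, 0.95) for the conditional expectation. The gap comes from the coalition {x1}: the path-dependent traversal weights the root split 50/50 even though knowing x1 = 1 makes x0 = 1 far more likely in the data.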

Issue Analytics

  • State: closed
  • Created: a year ago
  • Comments: 6

Top GitHub Comments

2 reactions
jlyang1990 commented, Mar 23, 2022

Thanks @glemaitre for pointing this out. The primary goal of the FastTreeSHAP package is to provide a fast implementation of the TreeSHAP algorithm (feature_perturbation="tree_path_dependent") that reproduces the results of the original implementation in the SHAP package. For users who currently run TreeSHAP with feature_perturbation="tree_path_dependent" in the SHAP package, we aim to offer a more efficient way of doing so.

The potential issue you pointed out concerns the mathematical details of the TreeSHAP algorithm in the original paper, so we think it is more appropriate to discuss it with the original authors of TreeSHAP. If they update the TreeSHAP algorithm (feature_perturbation="tree_path_dependent") in the SHAP package, we will update our implementation to make sure it produces exactly the same results.

0 reactions
jlyang1990 commented, May 30, 2022

Yes that’s correct.


