On marginal feature attributions of tree-based models

Khashayar Filom; Alexey Miroshnikov; Konstandinos Kotsiopoulos; Arjun Ravi Kannan

arXiv:2302.08434 · cs.LG · May 9, 2024 · 1 citation

TL;DR

This paper analyzes marginal feature attributions for tree-based models, contrasts them with the path-dependent TreeSHAP algorithm, shows how the internal structure of a model can be exploited to compute them efficiently, and gives explicit formulas for certain models such as CatBoost.

Contribution

The paper compares marginal and path-dependent attributions theoretically and derives an explicit, efficient formula for marginal Shapley values of CatBoost models.

Findings

01

Path-dependent TreeSHAP can rank features differently for two decision trees that compute exactly the same function.

02

Marginal feature attributions depend only on the model's input-output function.

03

Explicit formulas enable fast computation of attributions for certain models.
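The second finding can be illustrated with a minimal sketch. It is not the paper's algorithm: it is a brute-force computation of marginal Shapley values that only ever calls the model as a black box. The functions `tree_a` and `tree_b`, the helper `marginal_shapley`, and the synthetic background sample are all hypothetical names and data chosen for illustration. The two "trees" split on the features in opposite order but compute the same function, so their marginal attributions coincide. Path-dependent TreeSHAP is not computed here.

```python
from itertools import combinations
from math import factorial

import numpy as np


def tree_a(z):
    # Hypothetical tree: splits on feature 0 first, then on feature 1.
    if z[0] > 0:
        return 1.0 if z[1] > 0 else 0.0
    return 0.0


def tree_b(z):
    # Hypothetical tree: splits on feature 1 first, then on feature 0.
    # Different structure, same input-output function as tree_a.
    if z[1] > 0:
        return 1.0 if z[0] > 0 else 0.0
    return 0.0


def marginal_shapley(f, x, background):
    """Brute-force marginal Shapley values of f at x.

    The game value v(S) is the average of f over background rows whose
    features in S have been overwritten by the corresponding entries of x.
    Cost is exponential in the number of features.
    """
    n = len(x)

    def value(S):
        Z = background.copy()
        for j in S:
            Z[:, j] = x[j]
        return float(np.mean([f(z) for z in Z]))

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for S in combinations(others, k):
                phi[i] += weight * (value(S + (i,)) - value(S))
    return phi


rng = np.random.default_rng(0)
background = rng.normal(size=(200, 3))  # synthetic data; feature 2 is unused
x = np.array([1.0, 1.0, 1.0])

phi_a = marginal_shapley(tree_a, x, background)
phi_b = marginal_shapley(tree_b, x, background)
print(phi_a, phi_b)
```

Because `marginal_shapley` never inspects the structure of the tree, any two implementations of the same function receive the same attributions, and the unused third feature receives zero.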

Abstract

Due to their power and ease of use, tree-based machine learning models, such as random forests and gradient-boosted tree ensembles, have become very popular. To interpret them, local feature attributions based on marginal expectations, e.g. marginal (interventional) Shapley, Owen or Banzhaf values, may be employed. Such methods are true to the model and implementation invariant, i.e. dependent only on the input-output function of the model. We contrast this with the popular TreeSHAP algorithm by presenting two (statistically similar) decision trees that compute the exact same function for which the "path-dependent" TreeSHAP yields different rankings of features, whereas the marginal Shapley values coincide. Furthermore, we discuss how the internal structure of tree-based models may be leveraged to help with computing their marginal feature attributions according to a linear game value.…
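For reference, the quantities named in the abstract can be written out as follows. These are the standard textbook definitions; the notation (feature set $N$ with $|N| = n$, game $v^{\mathrm{ME}}$, attributions $\varphi_i$ and $\mathrm{Bz}_i$) is an assumption made here and may differ from the notation used in the paper.

```latex
% Marginal game of a model f at an instance x, with random features X:
v^{\mathrm{ME}}(S; x, f) = \mathbb{E}\big[ f(x_S, X_{-S}) \big],
\qquad S \subseteq N = \{1, \dots, n\}.

% Shapley value of feature i for a game v:
\varphi_i[v] = \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,(n - |S| - 1)!}{n!}
  \big( v(S \cup \{i\}) - v(S) \big).

% Banzhaf value of feature i for a game v:
\mathrm{Bz}_i[v] = \frac{1}{2^{\,n-1}} \sum_{S \subseteq N \setminus \{i\}}
  \big( v(S \cup \{i\}) - v(S) \big).
```

Both values are linear in $v$, and $v^{\mathrm{ME}}$ depends on $f$ only through its outputs, which is why attributions built this way are implementation invariant.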

Code & Models

Repositories

filomkhash/tree-based-paper (Official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Sports Analytics and Performance · Machine Learning and Data Classification