TL;DR
TreeGrad-Ranker introduces an efficient gradient-based method for feature ranking in decision trees, improving reliability and performance over traditional probabilistic value approaches.
Contribution
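The paper proposes TreeGrad-Ranker, a gradient-based algorithm that ranks features by directly optimizing the joint insertion/deletion objective instead of relying on probabilistic values. Its backbone, TreeGrad, computes the required gradients in $O(L)$ time for a decision tree with $L$ leaves.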
Findings
TreeGrad-Ranker outperforms existing methods on insertion and deletion metrics.
TreeGrad-Shap provides numerically stable Shapley value computations.
Linear TreeShap's numerical errors can be up to $10^{15}$ times larger than those of TreeGrad-Shap.
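To make the insertion and deletion metrics mentioned in these findings concrete, here is a minimal sketch under one common convention: features are added to an empty set (insertion) or removed from the full set (deletion) in ranked order, and the predicted values along the way are averaged. The function names and the toy value function `toy_value` are hypothetical and only illustrative; the paper's exact metric definition may differ.

```python
def toy_value(subset):
    # Hypothetical stand-in for a local prediction restricted to `subset`:
    # features 0 and 1 interact positively, feature 2 lowers the output.
    return (2.0 * (0 in subset) + 1.0 * (1 in subset)
            + 3.0 * (0 in subset and 1 in subset) - 0.5 * (2 in subset))

def insertion_deletion_scores(value_fn, ranking):
    """Average value along the insertion and deletion curves of a ranking."""
    inserted = set()
    remaining = set(ranking)
    insertion_curve = [value_fn(frozenset(inserted))]
    deletion_curve = [value_fn(frozenset(remaining))]
    for feature in ranking:  # most important feature first
        inserted.add(feature)
        remaining.discard(feature)
        insertion_curve.append(value_fn(frozenset(inserted)))
        deletion_curve.append(value_fn(frozenset(remaining)))
    insertion = sum(insertion_curve) / len(insertion_curve)
    deletion = sum(deletion_curve) / len(deletion_curve)
    return insertion, deletion

good = insertion_deletion_scores(toy_value, [0, 1, 2])
bad = insertion_deletion_scores(toy_value, [2, 1, 0])
print(good, bad)
```

Under this convention a better ranking yields a higher insertion score and a lower deletion score, which is what the toy example shows for `[0, 1, 2]` versus its reverse.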
Abstract
We revisit the use of probabilistic values, which include the well-known Shapley and Banzhaf values, to rank features for explaining the local predicted values of decision trees. The quality of feature rankings is typically assessed with the insertion and deletion metrics. Empirically, we observe that co-optimizing these two metrics is closely related to a joint optimization that selects a subset of features to maximize the local predicted value while minimizing it for the complement. However, we theoretically show that probabilistic values are generally unreliable for solving this joint optimization. Therefore, we explore deriving feature rankings by directly optimizing the joint objective. As the backbone, we propose TreeGrad, which computes the gradients of the multilinear extension of the joint objective in $O(L)$ time for decision trees with $L$ leaves; these gradients include…
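As an illustration of what "gradients of the multilinear extension" means, the sketch below evaluates them by brute force for a generic set function. It is not TreeGrad: it enumerates all subsets, so it runs in exponential time, and the names `subsets`, `multilinear_gradient`, and `toy_value` are hypothetical. The multilinear extension of a set function $v$ on $n$ features is $F(z) = \sum_{S} v(S) \prod_{i \in S} z_i \prod_{i \notin S} (1 - z_i)$, and a standard fact is that its gradient at $z = (1/2, \dots, 1/2)$ equals the Banzhaf value.

```python
from itertools import combinations

def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def multilinear_gradient(value_fn, n, z):
    """Brute-force gradient of the multilinear extension of `value_fn` at z.

    dF/dz_i is the expected marginal contribution of feature i when every
    other feature j is included independently with probability z[j].
    """
    grad = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for subset in subsets(others):
            weight = 1.0
            for j in others:
                weight *= z[j] if j in subset else 1.0 - z[j]
            total += weight * (value_fn(subset | {i}) - value_fn(subset))
        grad.append(total)
    return grad

def toy_value(subset):
    # Hypothetical set function with an interaction between features 0 and 1.
    return (2.0 * (0 in subset) + 1.0 * (1 in subset)
            + 3.0 * (0 in subset and 1 in subset) - 0.5 * (2 in subset))

# At the midpoint of the unit cube the gradient is the Banzhaf value.
banzhaf = multilinear_gradient(toy_value, 3, [0.5, 0.5, 0.5])
print(banzhaf)
```

Evaluating the gradient at other points of the unit cube weights coalitions differently, which is why a gradient-based ranker can explore more than a single fixed probabilistic value.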
