TL;DR
This paper investigates how different additive explanation methods characterize the non-additive components of black-box models, comparing how the methods differ, how accurate they are, and how useful practitioners find them.
Contribution
It compares global additive explanation methods for non-additive models, using the concepts of main and total effects to anchor the explanations, and quantitatively evaluates both additive and non-additive explanations.
Findings
Distilled explanations are generally the most accurate of the additive explanations.
Non-additive explanations, such as tree explanations that explicitly model non-additive components, tend to be even more accurate.
Practitioners prefer additive explanations for practical tasks.
Abstract
Many methods to explain black-box models, whether local or global, are additive. In this paper, we study global additive explanations for non-additive models, focusing on four explanation methods: partial dependence, Shapley explanations adapted to a global setting, distilled additive explanations, and gradient-based explanations. We show that different explanation methods characterize non-additive components in a black-box model's prediction function in different ways. We use the concepts of main and total effects to anchor additive explanations, and quantitatively evaluate additive and non-additive explanations. Even though distilled explanations are generally the most accurate additive explanations, non-additive explanations such as tree explanations that explicitly model non-additive components tend to be even more accurate. Despite this, our user study showed that machine learning…
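To make concrete what it means for an additive explanation to characterize a non-additive component, here is a minimal sketch of partial dependence, one of the four methods named in the abstract. It is not taken from the paper: the toy `black_box` function, the sample size, and the helper `partial_dependence` are assumptions chosen for illustration. The sketch uses the standard definition of partial dependence (average the prediction over the data with one feature fixed to a grid value).

```python
import numpy as np

def black_box(X):
    # Hypothetical non-additive model (an assumption, not from the paper):
    # an additive term x1 plus a pure interaction x1 * x2.
    return X[:, 0] + X[:, 0] * X[:, 1]

def partial_dependence(model, X, feature, grid):
    # Standard partial dependence: for each grid value, overwrite the chosen
    # feature in every row and average the model's predictions.
    values = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature] = v
        values.append(model(X_mod).mean())
    return np.array(values)

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(5000, 2))
X -= X.mean(axis=0)  # center both features so the interaction averages out

grid = np.linspace(-1.0, 1.0, 5)
pd_x1 = partial_dependence(black_box, X, 0, grid)  # recovers the x1 term
pd_x2 = partial_dependence(black_box, X, 1, grid)  # essentially flat at zero

# Additive surrogate built from the two partial dependence curves.
# Here it reduces to x1, so the interaction x1 * x2 is left unexplained.
additive_pred = X[:, 0]
rmse = np.sqrt(np.mean((black_box(X) - additive_pred) ** 2))
print("PD of x1:", np.round(pd_x1, 3))
print("PD of x2:", np.round(pd_x2, 3))
print("RMSE of additive surrogate:", round(float(rmse), 3))
```

In this toy setting the partial dependence curves capture only the additive term, and the interaction shows up as residual error in the additive surrogate. Other additive methods may distribute such a component differently across features, which is the kind of difference the abstract describes.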
