An unexpected unity among methods for interpreting model predictions

Scott Lundberg; Su-In Lee

arXiv:1611.07478 · cs.AI · December 9, 2016 · 104 cites

Open Access

TL;DR

This paper reveals a unifying additive representation for interpreting complex model predictions, demonstrating common principles across various methods and enabling new visual explanations.

Contribution

It introduces a model-agnostic additive importance representation that unifies existing interpretation methods and provides a basis for novel visual explanations.

Findings

01. Unified interpretation framework for prediction importance

02. Optimal additive importance representation satisfying key properties

03. New visual explanation techniques based on the unified representation

Abstract

Understanding why a model made a certain prediction is crucial in many data science fields. Interpretable predictions engender appropriate trust and provide insight into how the model may be improved. However, with large modern datasets the best accuracy is often achieved by complex models even experts struggle to interpret, which creates a tension between accuracy and interpretability. Recently, several methods have been proposed for interpreting predictions from complex models by estimating the importance of input features. Here, we present how a model-agnostic additive representation of the importance of input features unifies current methods. This representation is optimal, in the sense that it is the only set of additive values that satisfies important properties. We show how we can leverage these properties to create novel visual explanations of model predictions. The thread of…
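
The additive representation described in the abstract can be illustrated with a small sketch. It assumes the additive values are Shapley values computed against a fixed baseline input, which is one common way to realize such a representation and is not taken from the text above. The names `shapley_values`, `model`, `x`, and `baseline` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values for f at x, relative to a baseline input.

    Assumption: a feature that is "missing" is replaced by its baseline value.
    The cost is exponential in the number of features, so this is only
    practical for very small inputs.
    """
    n = len(x)

    def value(subset):
        # Evaluate f with features in `subset` taken from x, the rest from baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for a coalition of size k that excludes feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def model(z):
    # Toy model: one purely additive term plus one two-feature interaction.
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]

phi0 = model(baseline)                    # base value: the prediction at the baseline
phi = shapley_values(model, x, baseline)  # one additive importance value per feature

# Additivity: the base value plus the per-feature values recovers the prediction.
print(phi0, phi, phi0 + sum(phi), model(x))
```

In this toy case the additive term is credited entirely to the first feature, and the interaction term is split evenly between the other two, so the per-feature values plus the base value sum to the model output at `x`.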

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification