How to Explain Individual Classification Decisions

David Baehrens; Timon Schroeter; Stefan Harmeling; Motoaki Kawanabe; Katja Hansen; Klaus-Robert Mueller

arXiv:0912.1128·stat.ML·December 8, 2009·760 cites

TL;DR

This paper introduces a procedure for explaining individual classification decisions of any classification method, a capability that, according to the abstract, only decision trees previously offered.

Contribution

It proposes a method that, under a stated set of assumptions, explains why a classifier predicts a specific label for a single data point, regardless of the model type.

Findings

01 Provides a general framework for local explanations

02 Applicable to any classification model

03 Enhances interpretability of machine learning predictions

Abstract

After building a classifier with modern tools of machine learning, we typically have a black box at hand that is able to predict well for unseen data. Thus, we get an answer to the question of what the most likely label of a given unseen data point is. However, most methods provide no answer as to why the model predicted a particular label for a single instance, or which features were most influential for that instance. The only method currently able to provide such explanations is the decision tree. This paper proposes a procedure which (based on a set of assumptions) makes it possible to explain the decisions of any classification method.
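The abstract does not spell out the procedure itself, so the following is only a minimal sketch of one way a per-instance explanation can be computed. It assumes the explanation is taken to be the local gradient of the predicted class probability with respect to the input features, estimated by finite differences so that the classifier can stay a black box. The names `predict_proba` and `local_explanation`, the logistic toy model, and its weights are all hypothetical and chosen for illustration.

```python
import numpy as np

def predict_proba(x, w, b):
    """Hypothetical black-box classifier: a logistic model returning P(y=1 | x)."""
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))

def local_explanation(f, x, eps=1e-5):
    """Central finite-difference gradient of f at x, one entry per feature.

    Only evaluations of f are needed, so f may be any model that maps a
    feature vector to a class probability (an assumption of this sketch).
    """
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (f(x + step) - f(x - step)) / (2.0 * eps)
    return grad

# Illustrative weights and data point (not taken from the paper).
w = np.array([2.0, -1.0, 0.0])
b = 0.0
x0 = np.array([0.5, 1.0, 3.0])

explanation = local_explanation(lambda x: predict_proba(x, w, b), x0)
most_influential = int(np.argmax(np.abs(explanation)))
```

In this sketch, the sign of each entry indicates whether increasing that feature raises or lowers the predicted probability at `x0`, and the magnitude indicates how strongly. The third feature has zero weight in the toy model, so its entry is (numerically) zero.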

Taxonomy

Topics: Advanced Statistical Methods and Models · Anomaly Detection Techniques and Applications · Machine Learning and Data Classification