Programs as Black-Box Explanations

Sameer Singh; Marco Tulio Ribeiro; Carlos Guestrin

arXiv:1611.07579·stat.ML·November 24, 2016·35 cites

TL;DR

This paper proposes small, interpretable programs as model-agnostic explanations for black-box classifiers, a flexible alternative to committing to a single explanation family such as linear models, decision trees, or rule lists.

Contribution

It proposes a prototype program induction method based on simulated annealing that generates local explanations, using programs as a representation that generalizes several existing interpretable families.

Findings

01. Generated explanations are intuitive and accurate.

02. The method works on small datasets with various classifiers.

03. Prototypes demonstrate the approach's flexibility.

Abstract

Recent work in model-agnostic explanations of black-box machine learning has demonstrated that interpretability of complex models does not have to come at the cost of accuracy or model flexibility. However, it is not clear what kind of explanations, such as linear models, decision trees, and rule lists, are the appropriate family to consider, and different tasks and models may benefit from different kinds of explanations. Instead of picking a single family of representations, in this work we propose to use "programs" as model-agnostic explanations. We show that small programs can be expressive yet intuitive as explanations, and generalize over a number of existing interpretable families. We propose a prototype program induction method based on simulated annealing that approximates the local behavior of black-box classifiers around a specific prediction using random perturbations.…
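The loop the abstract describes (perturb the instance, query the black box, then search for a small program that mimics it locally) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the program representation (a conjunction of tests on binary features), the scoring rule with its size penalty, the linear cooling schedule, and every name and parameter here (`perturb`, `run_program`, `propose`, `explain`, `flip_prob`, `size_penalty`, `t0`) are assumptions made for the example.

```python
import math
import random

def perturb(x, n, rng, flip_prob=0.3):
    """Sample n neighbors of binary vector x by flipping each bit with flip_prob."""
    return [[1 - v if rng.random() < flip_prob else v for v in x] for _ in range(n)]

def run_program(program, z):
    """A toy 'program': a tuple of (feature_index, required_value) tests.
    It returns 1 if every test passes on z, else 0."""
    return int(all(z[i] == v for i, v in program))

def score(program, samples, labels, size_penalty=0.01):
    """Local fidelity to the black box, minus a penalty that favors small programs."""
    agree = sum(run_program(program, z) == y for z, y in zip(samples, labels))
    return agree / len(samples) - size_penalty * len(program)

def propose(program, n_features, rng):
    """Random edit: add a test on a feature, or remove or toggle an existing one."""
    tests = dict(program)
    i = rng.randrange(n_features)
    if i in tests:
        if rng.random() < 0.5:
            del tests[i]
        else:
            tests[i] = 1 - tests[i]
    else:
        tests[i] = rng.randrange(2)
    return tuple(sorted(tests.items()))

def explain(black_box, x, n_samples=500, steps=2000, t0=0.5, seed=0):
    """Simulated annealing over toy programs that approximate black_box near x."""
    rng = random.Random(seed)
    samples = perturb(x, n_samples, rng)
    labels = [black_box(z) for z in samples]
    current = ()  # the empty program always predicts 1
    cur_s = score(current, samples, labels)
    best, best_s = current, cur_s
    for step in range(steps):
        temp = t0 * (1 - step / steps) + 1e-6  # linear cooling (an assumption)
        cand = propose(current, len(x), rng)
        cand_s = score(cand, samples, labels)
        # Always accept improvements; accept worse candidates with a
        # probability that shrinks as the temperature drops.
        if cand_s >= cur_s or rng.random() < math.exp((cand_s - cur_s) / temp):
            current, cur_s = cand, cand_s
            if cur_s > best_s:
                best, best_s = current, cur_s
    return best, best_s

# Hypothetical black box: predicts 1 when feature 0 is on and feature 2 is off.
def toy_model(z):
    return int(z[0] == 1 and z[2] == 0)

program, fidelity_score = explain(toy_model, [1, 0, 0, 1, 0])
print(program, fidelity_score)
```

The size penalty is what keeps the explanation small: a program with redundant tests scores lower than an equally faithful shorter one, so the search is pushed toward the shortest program that still matches the classifier on the perturbed samples.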

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning

Methods: Interpretability