Assessing the Local Interpretability of Machine Learning Models
Dylan Slack, Sorelle A. Friedler, Carlos Scheidegger, and Chitradeep Dutta Roy

TL;DR
Through user tasks, this study evaluates how well people can interpret machine learning models. It finds that simpler models such as decision trees are more interpretable than neural networks, and it tracks interpretability with a new metric.
Contribution
The paper introduces a user study testing interpretability definitions and proposes runtime operation count as a metric for local interpretability of models.
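To make the idea of a runtime operation count concrete, here is a minimal Python sketch that tallies the operations a person would carry out when running a small decision tree or a logistic regression by hand on one input. The counting convention (one operation per comparison, multiplication, or addition), the `Node` class, and both function names are assumptions made for illustration; they are not taken from the paper and may differ from how the authors count.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Node:
    """Hypothetical binary decision tree node; leaves carry a label."""
    feature: int = 0
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[int] = None  # set only on leaf nodes


def tree_predict_with_count(root: Node, x: Sequence[float]) -> Tuple[int, int]:
    """Return (prediction, operation count) for one input.

    Assumed convention: one comparison per internal node visited.
    """
    ops = 0
    node = root
    while node.label is None:
        ops += 1  # comparison x[feature] <= threshold
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label, ops


def logreg_predict_with_count(
    weights: Sequence[float], bias: float, x: Sequence[float]
) -> Tuple[int, int]:
    """Return (prediction, operation count) for one input.

    Assumed convention: one multiplication and one addition per feature,
    plus one final comparison of the score against zero.
    """
    ops = 0
    score = bias
    for w, xi in zip(weights, x):
        score += w * xi
        ops += 2  # one multiplication, one addition
    ops += 1  # comparison score >= 0
    return int(score >= 0), ops


# A depth-2 tree: split on feature 0, then on feature 1 in the left branch.
example_tree = Node(
    feature=0,
    threshold=2.0,
    left=Node(feature=1, threshold=3.0, left=Node(label=0), right=Node(label=1)),
    right=Node(label=1),
)
example_input = [1.0, 5.0]

tree_label, tree_ops = tree_predict_with_count(example_tree, example_input)
lr_label, lr_ops = logreg_predict_with_count([0.5, -0.25], 0.1, example_input)
```

Under this convention, the tree's count depends on the path taken for a given input, while the logistic regression's count grows linearly with the number of features, which is one way such a metric can rank models on a per-input (local) basis.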
Findings
Participants perform better on simpler models
Operation count correlates with interpretability
Decision trees and logistic regression are more interpretable than neural networks
Abstract
The increasing adoption of machine learning tools has led to calls for accountability via model interpretability. But what does it mean for a machine learning model to be interpretable by humans, and how can this be assessed? We focus on two definitions of interpretability that have been introduced in the machine learning literature: simulatability (a user's ability to run a model on a given input) and "what if" local explainability (a user's ability to correctly determine a model's prediction under local changes to the input, given knowledge of the model's original prediction). Through a user study with 1,000 participants, we test whether humans perform well on tasks that mimic the definitions of simulatability and "what if" local explainability on models that are typically considered locally interpretable. To track the relative interpretability of models, we employ a simple metric,…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
Methods: Interpretability · Logistic Regression
