An Evaluation of the Human-Interpretability of Explanation
Isaac Lage, Emily Chen, Jeffrey He, Menaka Narayanan, Been Kim, Sam Gershman, Finale Doshi-Velez

TL;DR
This study investigates what makes machine learning explanations truly human-interpretable by analyzing user performance across tasks and identifying design principles for better explanations.
Contribution
It provides empirical evidence on how explanation complexity impacts interpretability and identifies regularizers that can be used to optimize explanations for interpretability.
Findings
The type of complexity matters: cognitive chunks (newly defined concepts) strongly affect user performance.
Variable repetition has less impact on performance than cognitive chunks.
Consistent trends suggest common principles for explanation system design.
Abstract
Recent years have seen a boom in interest in machine learning systems that can provide a human-understandable rationale for their predictions or decisions. However, exactly what kinds of explanation are truly human-interpretable remains poorly understood. This work advances our understanding of what makes explanations interpretable under three specific tasks that users may perform with machine learning systems: simulation of the response, verification of a suggested response, and determining whether the correctness of a suggested response changes under a change to the inputs. Through carefully controlled human-subject experiments, we identify regularizers that can be used to optimize for the interpretability of machine learning systems. Our results show that the type of complexity matters: cognitive chunks (newly defined concepts) affect performance more than variable repetitions, and…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning
Methods: Interpretability
