Explaining Explanations in AI
Brent Mittelstadt, Chris Russell, Sandra Wachter

TL;DR
This paper argues that the simplified approximating models used for AI interpretability are useful but limited as explanations, and draws on philosophy and sociology for a broader account of what an explanation should be.
Contribution
It highlights the distinction between interpretability models and explanations, proposing a broader view of explanations in AI inspired by philosophy and sociology.
Findings
Simplified models are useful but limited as explanations.
Different philosophical schools offer diverse perspectives on what constitutes an explanation.
Broader approaches could enhance AI interpretability and explanation methods.
Abstract
Recent work on interpretability in machine learning and AI has focused on the building of simplified models that approximate the true criteria used to make decisions. These models are a useful pedagogical device for teaching trained professionals how to predict what decisions will be made by the complex system, and most importantly how the system might break. However, when considering any such model it's important to remember Box's maxim that "All models are wrong but some are useful." We focus on the distinction between these models and explanations in philosophy and sociology. These models can be understood as a "do it yourself kit" for explanations, allowing a practitioner to directly answer "what if questions" or generate contrastive explanations without external assistance. Although a valuable ability, giving these models as explanations appears more difficult than necessary, and…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Computational and Text Analysis Methods
Methods: Interpretability
