Diagnosing AI Explanation Methods with Folk Concepts of Behavior
Alon Jacovi, Jasmijn Bastings, Sebastian Gehrmann, Yoav Goldberg, Katja Filippova

TL;DR
This paper proposes a formal framework using folk concepts of behavior to evaluate and improve AI explanation methods, aiming to enhance human understanding and reduce misunderstandings.
Contribution
It introduces a novel folk concept-based framework for diagnosing AI explanation methods and maps existing methods to this framework for qualitative evaluation.
Findings
Many current XAI methods can be mapped to folk concepts of behavior
Failure modes in explanation methods can be traced to missing information constructs
Including the missing constructs can improve explanation effectiveness
Abstract
We investigate a formalism for the conditions of a successful explanation of AI. We consider "success" to depend not only on what information the explanation contains, but also on what information the human explainee understands from it. Theory of mind literature discusses the folk concepts that humans use to understand and generalize behavior. We posit that folk concepts of behavior provide us with a "language" that humans understand behavior with. We use these folk concepts as a framework of social attribution by the human explainee - the information constructs that humans are likely to comprehend from explanations - by introducing a blueprint for an explanatory narrative (Figure 1) that explains AI behavior with these constructs. We then demonstrate that many XAI methods today can be mapped to folk concepts of behavior in a qualitative evaluation. This allows us to uncover their…
