Explanations are a Means to an End: Decision Theoretic Explanation Evaluation

Ziyang Guo; Berk Ustun; Jessica Hullman

arXiv:2506.22740 · cs.AI · February 24, 2026

TL;DR

This paper introduces a decision theoretic framework that evaluates an explanation by the expected improvement it enables on a specified decision task, and it provides both theoretical benchmarks and a practical validation workflow.

Contribution

It links explanation evaluation directly to decision-making performance and defines three distinct estimands: a theoretical benchmark, a human-complementary value, and a behavioral value.

Findings

01. The framework defines three explanation values: theoretical, human-complementary, and behavioral.

02. Applied to human-AI decision support, it quantifies the potential value of explanations.

03. It is also validated in mechanistic interpretability settings.

Abstract

Explanations of model behavior are commonly evaluated via proxy properties weakly tied to the purposes explanations serve in practice. We contribute a decision theoretic framework that treats explanations as information signals valued by the expected improvement they enable on a specified decision task. This approach yields three distinct estimands: 1) a theoretical benchmark that upper-bounds achievable performance by any agent with the explanation, 2) a human-complementary value that quantifies the theoretically attainable value that is not already captured by a baseline human decision policy, and 3) a behavioral value representing the causal effect of providing the explanation to human decision-makers. We instantiate these definitions in a practical validation workflow, and apply them to assess explanation potential and interpret behavioral effects in human-AI decision support and…
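As a rough illustration of the three estimands named in the abstract, the sketch below computes each one on a toy binary decision task. Everything in it is an assumption made for illustration and is not taken from the paper: the 0/1 payoff, the accuracy parameters `p_human` and `p_expl`, the conditional independence of the human judgment and the explanation, the specific differences used to define each value, and the made-up experiment outcomes. The paper's formal definitions may differ.

```python
from collections import defaultdict
from statistics import mean

ACTIONS = (0, 1)

def payoff(action, state):
    # Assumed payoff: 1 for a correct decision, 0 otherwise.
    return 1.0 if action == state else 0.0

def best_response_value(joint):
    """Expected payoff of an agent who observes the signal and best-responds.

    `joint` maps (signal, state) -> probability.
    """
    mass = defaultdict(lambda: defaultdict(float))
    for (signal, state), p in joint.items():
        mass[signal][state] += p
    # For each signal, pick the action with the highest expected payoff.
    return sum(
        max(sum(p * payoff(a, s) for s, p in states.items()) for a in ACTIONS)
        for states in mass.values()
    )

def toy_joint(p_human=0.7, p_expl=0.8):
    """Hypothetical joint over (human judgment h, explanation e, state s).

    Assumes a uniform prior on s, and that h and e each match s with the
    given probability, independently given s.
    """
    joint = {}
    for s in (0, 1):
        for h in (0, 1):
            for e in (0, 1):
                ph = p_human if h == s else 1 - p_human
                pe = p_expl if e == s else 1 - p_expl
                joint[(h, e, s)] = 0.5 * ph * pe
    return joint

def project(joint, keep):
    """Marginalize to the signal selected by `keep(h, e)`."""
    out = defaultdict(float)
    for (h, e, s), p in joint.items():
        out[(keep(h, e), s)] += p
    return dict(out)

joint = toy_joint()
no_info = best_response_value(project(joint, lambda h, e: None))
with_expl = best_response_value(project(joint, lambda h, e: e))
human_only = best_response_value(project(joint, lambda h, e: h))
human_and_expl = best_response_value(project(joint, lambda h, e: (h, e)))

# 1) Theoretical benchmark (assumed form): best achievable gain from the
#    explanation over deciding on the prior alone.
theoretical_value = with_expl - no_info

# 2) Human-complementary value (assumed form): attainable gain beyond what
#    the baseline human judgment already captures.
complementary_value = human_and_expl - human_only

# 3) Behavioral value (assumed form): difference in mean payoff between
#    participants randomized to see the explanation and those who were not.
def behavioral_value(treated_payoffs, control_payoffs):
    return mean(treated_payoffs) - mean(control_payoffs)

# Made-up outcomes from a hypothetical randomized study (1 = correct).
treated = [1, 1, 0, 1, 1, 0, 1, 1]
control = [1, 0, 0, 1, 1, 0, 1, 0]
observed_effect = behavioral_value(treated, control)
```

In this toy setup the explanation is worth more to an agent with no other information than it adds on top of the human's own judgment, which is the kind of gap the human-complementary value is meant to expose. Comparing the behavioral value against the two benchmarks then indicates how much of the attainable improvement people actually realize.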

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Embodied and Extended Cognition