Towards a Unified Framework for Evaluating Explanations

Juan D. Pinto; Luc Paquette

arXiv:2405.14016 · cs.LG · July 16, 2024

Open Access

TL;DR

This paper reviews how the ML and HCI communities evaluate interpretability and proposes a unified framework built around four criteria for explanations: faithfulness, intelligibility, plausibility, and stability.

Contribution

The paper lays the groundwork for a unified evaluation framework for interpretability by articulating the relationships between existing criteria and integrating perspectives from the ML and HCI communities.

Findings

01. Identifies overlaps and semantic misalignments in how interpretability is currently evaluated.

02. Proposes relationships between explanation criteria such as faithfulness and intelligibility.

03. Illustrates the framework with examples from a neural network interpretability study.

Abstract

The challenge of creating interpretable models has been taken up by two main research communities: ML researchers primarily focused on lower-level explainability methods that suit the needs of engineers, and HCI researchers who have more heavily emphasized user-centered approaches often based on participatory design methods. This paper reviews how these communities have evaluated interpretability, identifying overlaps and semantic misalignments. We propose moving towards a unified framework of evaluation criteria and lay the groundwork for such a framework by articulating the relationships between existing criteria. We argue that explanations serve as mediators between models and stakeholders, whether for intrinsically interpretable models or opaque black-box models analyzed via post-hoc techniques. We further argue that useful explanations require both faithfulness and intelligibility.…

Taxonomy

Topics: Evaluation and Performance Assessment · Scientific Computing and Data Management