Evaluating Input Feature Explanations through a Unified Diagnostic Evaluation Framework

Jingyi Sun; Pepa Atanasova; Isabelle Augenstein

arXiv:2406.15085·cs.CL·February 10, 2025

TL;DR

This paper introduces a unified framework for comparing different input feature explanation methods in machine learning, revealing that interactive span explanations generally outperform other types across various diagnostic properties.

Contribution

The authors develop a comprehensive framework for directly comparing highlight and interactive explanations, enabling systematic evaluation across multiple diagnostic properties.

Findings

01. Interactive span explanations outperform other explanation types in most diagnostic properties.

02. Different explanation methods have distinct strengths depending on the diagnostic property.

03. The study highlights the need for further research to improve and combine explanation techniques.

Abstract

Explaining the decision-making process of machine learning models is crucial for ensuring their reliability and transparency for end users. One popular explanation form highlights key input features, such as i) tokens (e.g., Shapley Values and Integrated Gradients), ii) interactions between tokens (e.g., Bivariate Shapley and Attention-based methods), or iii) interactions between spans of the input (e.g., Louvain Span Interactions). However, these explanation types have only been studied in isolation, making it difficult to judge their respective applicability. To bridge this gap, we develop a unified framework, comprising four diagnostic properties, that facilitates an automated and direct comparison between highlight and interactive explanations. We conduct an extensive analysis across these three types of input feature explanations -- each utilizing three different explanation…
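
As a rough illustration of the first explanation type named in the abstract (token-level highlights via Shapley values), the sketch below computes exact Shapley values by averaging each token's marginal contribution over all token orderings. The token list, the `toy_score` function, and its numbers are hypothetical stand-ins and are not taken from the paper; practical methods approximate this quantity against a real model rather than enumerating every ordering.

```python
from itertools import permutations

def shapley_values(n_tokens, value_fn):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = [0.0] * n_tokens
    orderings = list(permutations(range(n_tokens)))
    for order in orderings:
        present = set()
        for i in order:
            before = value_fn(frozenset(present))
            present.add(i)
            after = value_fn(frozenset(present))
            totals[i] += after - before
    return [t / len(orderings) for t in totals]

# Hypothetical input; indices 0, 1, 2 refer to these tokens.
TOKENS = ["not", "good", "movie"]

def toy_score(present):
    """Hypothetical stand-in for a model score when only `present` tokens are kept."""
    score = 0.0
    if 1 in present:
        score += 1.0  # "good" on its own raises the score
    if 0 in present and 1 in present:
        score -= 2.0  # "not" and "good" together flip the score
    return score

token_scores = shapley_values(len(TOKENS), toy_score)
for token, score in zip(TOKENS, token_scores):
    print(f"{token}: {score:+.2f}")
```

In this toy setup the attributions sum to the score of the full input, yet "good" receives a score of zero because its individual effect and its share of the joint effect with "not" cancel out. That is the kind of information a per-token highlight cannot show and that token-level or span-level interaction explanations are meant to expose.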

Taxonomy

Topics: Face and Expression Recognition