EXAGREE: Mitigating Explanation Disagreement with Stakeholder-Aligned Models

Sichao Li; Tommy Liu; Quanling Deng; Amanda S. Barnard

arXiv:2411.01956·cs.LG·November 18, 2025

TL;DR

EXAGREE is a framework that mitigates explanation disagreement by selecting, from a set of similar-performing models, the one whose explanations best align with stakeholders. It reports gains in faithfulness, plausibility, and fairness without sacrificing task accuracy.

Contribution

It introduces a two-stage method that couples a differentiable mask-based attribution network (DMAN) with monotone differentiable sorting, so that a model aligned with stakeholder explanations can be found by gradient-based search within the set of similar-performing models.

Findings

01. Improves faithfulness, plausibility, and fairness over baselines

02. Maintains task accuracy while enhancing explanation quality

03. Demonstrates robustness across six real-world datasets

Abstract

Conflicting explanations, arising from different attribution methods or model internals, limit the adoption of machine learning models in safety-critical domains. We turn this disagreement into an advantage and introduce EXplanation AGREEment (EXAGREE), a two-stage framework that selects a Stakeholder-Aligned Explanation Model (SAEM) from a set of similar-performing models. The selection maximizes Stakeholder-Machine Agreement (SMA), a single metric that unifies faithfulness and plausibility. EXAGREE couples a differentiable mask-based attribution network (DMAN) with monotone differentiable sorting, enabling gradient-based search inside the constrained model space. Experiments on six real-world datasets demonstrate simultaneous gains of faithfulness, plausibility, and fairness over baselines, while preserving task accuracy. Extensive ablation studies, significance tests, and case…
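The idea of ranking features in a differentiable way and then choosing, among similar-performing models, the one that best agrees with a stakeholder can be illustrated with a small sketch. This is not the paper's implementation: the pairwise-sigmoid soft rank below is a generic smooth relaxation standing in for the monotone differentiable sorting operator, the Spearman-style correlation stands in for SMA (whose definition is not given in this excerpt), and the names `soft_rank`, `rank_agreement`, `select_model`, `tau`, and `eps`, as well as the loss-tolerance rule, are assumptions made for illustration.

```python
import numpy as np


def soft_rank(scores, tau=0.1):
    """Smooth descending rank: the largest score gets a rank close to 1.

    Assumption: a pairwise-sigmoid relaxation, used here only as a
    stand-in for a differentiable sorting operator.
    """
    s = np.asarray(scores, dtype=float)
    diff = (s[None, :] - s[:, None]) / tau      # diff[i, j] = (s_j - s_i) / tau
    p = 1.0 / (1.0 + np.exp(-diff))             # close to 1 where s_j > s_i
    # Remove the diagonal term sigmoid(0) = 0.5, then shift so ranks start at 1.
    return 1.0 + p.sum(axis=1) - 0.5


def rank_agreement(model_attr, stakeholder_rank, tau=0.1):
    """Correlation between soft ranks of attributions and a stakeholder ranking.

    Assumption: a Spearman-style score in [-1, 1], not the paper's SMA metric.
    """
    r = soft_rank(model_attr, tau)
    t = np.asarray(stakeholder_rank, dtype=float)
    r_c, t_c = r - r.mean(), t - t.mean()
    return float((r_c @ t_c) / (np.linalg.norm(r_c) * np.linalg.norm(t_c)))


def select_model(candidates, stakeholder_rank, best_loss, eps=0.05):
    """Pick the best-agreeing candidate whose loss is within eps of best_loss.

    Each candidate is a dict with hypothetical keys "name", "loss", "attr".
    """
    admissible = [c for c in candidates if c["loss"] <= best_loss * (1 + eps)]
    return max(admissible, key=lambda c: rank_agreement(c["attr"], stakeholder_rank))


if __name__ == "__main__":
    # Toy data: three features, stakeholder ranks feature 0 first, feature 1 last.
    stakeholder = [1, 3, 2]
    candidates = [
        {"name": "a", "loss": 0.30, "attr": [0.1, 0.9, 0.5]},
        {"name": "b", "loss": 0.31, "attr": [0.9, 0.1, 0.5]},
        {"name": "c", "loss": 0.50, "attr": [0.9, 0.1, 0.5]},
    ]
    chosen = select_model(candidates, stakeholder, best_loss=0.30)
    print(chosen["name"])
```

In this toy setup, model `c` is excluded because its loss falls outside the tolerance, and the remaining candidates are compared only on agreement. Because `soft_rank` is smooth in the scores, the same agreement score could serve as an objective for gradient-based search when the attributions come from a differentiable network, which is the role the abstract assigns to DMAN.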

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Scientific Computing and Data Management · Machine Learning in Healthcare

Methods: Sparse Evolutionary Training