Dissenting Explanations: Leveraging Disagreement to Reduce Model Overreliance
Omer Reingold, Judy Hanwen Shen, Aditi Talati

TL;DR
This paper introduces dissenting explanations, which leverage disagreements among models to help humans better understand predictions and reduce overreliance on potentially incorrect explanations, without sacrificing accuracy.
Contribution
It proposes the concept of dissenting explanations, explores their benefits in the setting of model multiplicity, and develops methods for generating global and local dissenting explanations.
Findings
Dissenting explanations reduce overreliance on model predictions.
They do not reduce overall accuracy.
Pilot studies support their utility in interpretability.
Abstract
While explainability is a desirable characteristic of increasingly complex black-box models, modern explanation methods have been shown to be inconsistent and contradictory. The semantics of explanations is not always fully understood - to what extent do explanations "explain" a decision and to what extent do they merely advocate for a decision? Can we help humans gain insights from explanations accompanying correct predictions and not over-rely on incorrect predictions advocated for by explanations? With this perspective in mind, we introduce the notion of dissenting explanations: conflicting predictions with accompanying explanations. We first explore the advantage of dissenting explanations in the setting of model multiplicity, where multiple models with similar performance may have different predictions. In such cases, providing dissenting explanations could be done by invoking the…
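The model-multiplicity setting described in the abstract can be pictured with a small sketch: two classifiers of similar quality disagree on an input, and each prediction is shown with its own explanation. Everything below is a hypothetical illustration rather than the authors' method: the feature names, the weights, and the helpers `explain` and `dissenting_explanation` are assumptions, and per-feature contributions of a linear model stand in for whatever explanation method is actually used.

```python
# Illustrative sketch only: the weights, feature names, and helper functions
# below are hypothetical and are not taken from the paper.

FEATURES = ["positive_words", "negative_words", "exclamations"]

# Two linear classifiers assumed (for illustration) to perform similarly.
MODEL_A = {"weights": [1.0, -1.0, 0.2], "bias": 0.0}
MODEL_B = {"weights": [0.6, -1.2, -0.5], "bias": 0.1}


def score(model, x):
    """Linear score: bias plus the weighted sum of the features."""
    return model["bias"] + sum(w * v for w, v in zip(model["weights"], x))


def predict(model, x):
    """Return 1 if the score is positive, else 0."""
    return 1 if score(model, x) > 0 else 0


def explain(model, x):
    """Per-feature contributions (weight * value), largest magnitude first."""
    contribs = [(name, w * v) for name, w, v in zip(FEATURES, model["weights"], x)]
    return sorted(contribs, key=lambda item: abs(item[1]), reverse=True)


def dissenting_explanation(primary, alternative, x):
    """If the two models disagree on x, return both predictions with explanations.

    Returns None when the models agree, since there is no dissent to show.
    """
    p, q = predict(primary, x), predict(alternative, x)
    if p == q:
        return None
    return {
        "prediction": p,
        "explanation": explain(primary, x),
        "dissenting_prediction": q,
        "dissenting_explanation": explain(alternative, x),
    }


example = [2.0, 1.0, 3.0]
result = dissenting_explanation(MODEL_A, MODEL_B, example)
```

On this input the two models disagree, so `result` carries the primary prediction with its explanation alongside the conflicting prediction and its explanation. A reader sees the case for both labels instead of a single explanation that only advocates for one.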
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Scientific Computing and Data Management · Machine Learning in Materials Science
