Diagnostics-Guided Explanation Generation
Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein

TL;DR
This paper introduces a method to directly optimize explanation quality using diagnostic properties like Faithfulness, Data Consistency, and Confidence Indication, leading to better explanations and task performance without needing human annotations.
Contribution
It proposes a novel training approach that explicitly optimizes multiple diagnostic properties for explanation generation models, improving their quality and alignment with human rationales.
Findings
Enhanced explanation quality and human agreement.
Improved downstream task performance.
Effective optimization of diagnostic properties.
Abstract
Explanations shed light on a machine learning model's rationales and can aid in identifying deficiencies in its reasoning process. Explanation generation models are typically trained in a supervised way given human explanations. When such annotations are not available, explanations are often selected as those portions of the input that maximise a downstream task's performance, which corresponds to optimising an explanation's Faithfulness to a given model. Faithfulness is one of several so-called diagnostic properties, which prior work has identified as useful for gauging the quality of an explanation without requiring annotations. Other diagnostic properties are Data Consistency, which measures how similar explanations are for similar input instances, and Confidence Indication, which shows whether the explanation reflects the confidence of the model. In this work, we show how to…
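The diagnostic properties named in the abstract can be made concrete with a small sketch. The Python below is an illustration only, not the paper's formulation: the function names, the toy classifier `toy_predict`, the use of cosine similarity, and the top-k masking scheme are all assumptions chosen for readability. It covers rough proxies for Faithfulness and Data Consistency; Confidence Indication is left out.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def faithfulness_gap(predict, tokens, saliency, k):
    """Hypothetical Faithfulness proxy: absolute change in the model's output
    when only the k most salient tokens are kept. A smaller gap suggests the
    selected tokens are sufficient for the prediction."""
    top = sorted(range(len(tokens)), key=lambda i: saliency[i], reverse=True)[:k]
    kept = [tokens[i] for i in sorted(top)]  # preserve original token order
    return abs(predict(tokens) - predict(kept))


def data_consistency_gap(x1, x2, e1, e2):
    """Hypothetical Data Consistency proxy: distance between the similarity of
    two inputs (x1, x2) and the similarity of their explanations (e1, e2).
    A smaller gap means similar inputs receive similar explanations."""
    return abs(cosine(x1, x2) - cosine(e1, e2))


def toy_predict(tokens):
    """Toy stand-in for a classifier: positive-class probability grows with
    each occurrence of the word 'great'. Purely illustrative."""
    return min(1.0, 0.5 + 0.25 * tokens.count("great"))


tokens = ["the", "film", "was", "great"]
good_saliency = [0.0, 0.1, 0.0, 0.9]  # highlights the decisive token
bad_saliency = [0.9, 0.1, 0.0, 0.0]   # highlights an irrelevant token

gap_good = faithfulness_gap(toy_predict, tokens, good_saliency, k=1)  # 0.0
gap_bad = faithfulness_gap(toy_predict, tokens, bad_saliency, k=1)    # 0.25
```

In this toy setting the saliency map that highlights "great" leaves the prediction unchanged, while the one that highlights "the" shifts it by 0.25. A training objective that penalised such gaps would be one way to optimise a diagnostic property directly, under the assumptions stated above.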
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Machine Learning and Data Classification
