Explain Yourself, Briefly! Self-Explaining Neural Networks with Concise Sufficient Reasons
Shahaf Bassan, Ron Eliav, Shlomit Gur

TL;DR
This paper introduces a self-supervised training method called sufficient subset training (SST) that enables neural networks to generate concise, faithful explanations as part of their predictions, overcoming computational and reliability issues of previous post-hoc methods.
Contribution
The paper presents SST, a novel training approach that integrates explanation generation into neural network training, so that minimal sufficient reasons are obtained more efficiently and more faithfully than with post-hoc methods.
Findings
SST produces more succinct explanations than existing post-hoc methods.
Models trained with SST maintain predictive accuracy comparable to that of models trained without it.
SST substantially reduces explanation generation time relative to post-hoc methods.
Abstract
*Minimal sufficient reasons* represent a prevalent form of explanation: the smallest subset of input features that, when held constant at their corresponding values, ensures that the prediction remains unchanged. Previous *post-hoc* methods attempt to obtain such explanations but face two main limitations: (1) obtaining these subsets poses a computational challenge, leading most scalable methods to converge towards suboptimal, less meaningful subsets; (2) these methods rely heavily on sampling out-of-distribution input assignments, potentially resulting in counterintuitive behaviors. To tackle these limitations, we propose a self-supervised training approach, which we term *sufficient subset training* (SST). Using SST, we train models to generate concise sufficient reasons for their predictions as an integral part of their output. Our results indicate that our framework…
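To make the definition of a minimal sufficient reason concrete, here is a minimal sketch in Python. It is an illustration of the definition only, not the SST method from the paper. The toy classifier `predict`, the binary feature domain, and the helper names `is_sufficient` and `minimum_sufficient_reason` are all assumptions introduced for this example, and "prediction remains unchanged" is read here as holding for every possible assignment of the features left free. The exhaustive search is exponential in the number of features, which hints at the computational challenge the abstract attributes to post-hoc methods.

```python
from itertools import combinations, product


def predict(x):
    # Hypothetical toy classifier over 3 binary features: x0 AND (x1 OR x2).
    return int(x[0] and (x[1] or x[2]))


def is_sufficient(f, x, subset, domain=(0, 1)):
    """Return True if fixing x on `subset` keeps f at f(x) for every
    assignment of the remaining (free) features drawn from `domain`."""
    free = [i for i in range(len(x)) if i not in subset]
    target = f(x)
    for values in product(domain, repeat=len(free)):
        z = list(x)
        for i, v in zip(free, values):
            z[i] = v
        if f(z) != target:
            return False
    return True


def minimum_sufficient_reason(f, x):
    """Exhaustively search for a smallest-cardinality sufficient subset.
    Subsets are tried in order of increasing size, so the first hit is minimal."""
    n = len(x)
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            if is_sufficient(f, x, subset):
                return subset
    return tuple(range(n))  # unreachable: the full feature set is always sufficient


print(minimum_sufficient_reason(predict, (1, 1, 0)))  # (0, 1)
print(minimum_sufficient_reason(predict, (0, 1, 1)))  # (0,)
```

For the input `(1, 1, 0)`, fixing features 0 and 1 pins the prediction to 1 whatever feature 2 is; for `(0, 1, 1)`, fixing feature 0 alone pins it to 0. Quantifying over the whole domain is only one possible reading of sufficiency; other readings (for example, over a neighborhood or a distribution of completions) would change `is_sufficient` but not the search structure.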
Taxonomy
Topics: Explainable Artificial Intelligence (XAI)
