Learn from A Rationalist: Distilling Intermediate Interpretable Rationales
Jiayi Dai, Randy Goebel

TL;DR
This paper introduces REKD, a knowledge distillation approach in which a student rationale extraction model learns from a teacher model's rationales and predictions, improving the student's interpretability and predictive performance.
Contribution
It proposes a novel knowledge distillation framework for rationale extraction models that improves performance and interpretability across language and vision tasks.
Findings
REKD significantly improves student model accuracy.
The approach applies to various neural network architectures.
It is effective on both language and vision datasets.
Abstract
Because of the pervasive use of deep neural networks (DNNs), especially in high-stakes domains, the interpretability of DNNs has received increased attention. The general idea of rationale extraction (RE) is to provide an interpretable-by-design framework for DNNs via a select-predict architecture where two neural networks learn jointly to perform feature selection and prediction, respectively. Given only the remote supervision from the final task prediction, the process of learning to select subsets of features (or rationales) requires searching in the space of all possible feature combinations, which is computationally challenging and even harder when the base neural networks are not sufficiently capable. To improve the predictive performance of RE models that are based on less capable or smaller neural networks (i.e., the students), we propose REKD (Rationale…
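The select-predict idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear scoring selector, the logistic predictor, the top-k selection rule, and every function name and weight below are assumptions chosen for illustration, standing in for the two neural networks. The sketch shows only that the predictor sees nothing but the selected features, and why the space of candidate rationales grows exponentially with the number of features.

```python
import math

def select(features, selector_weights, k):
    """Hypothetical selector: score each feature, keep the top-k as a hard 0/1 mask."""
    scores = [w * x for w, x in zip(selector_weights, features)]
    top = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)[:k]
    return [1 if i in top else 0 for i in range(len(features))]

def predict(features, mask, predictor_weights, bias=0.0):
    """Hypothetical predictor: logistic model that only sees the masked-in features."""
    z = bias + sum(w * x * m for w, x, m in zip(predictor_weights, features, mask))
    return 1.0 / (1.0 + math.exp(-z))

def num_candidate_rationales(n_features):
    """Each feature is either in or out of a rationale, so there are 2^n subsets."""
    return 2 ** n_features

# Toy input with four features (illustrative values, not data from the paper).
features = [0.2, 1.5, -0.3, 0.9]
mask = select(features, selector_weights=[1.0, 1.0, 1.0, 1.0], k=2)
prob = predict(features, mask, predictor_weights=[0.5, 1.0, 0.5, 1.0])

print("rationale mask:", mask)
print("predicted probability:", round(prob, 4))
print("candidate rationales for 4 features:", num_candidate_rationales(4))
```

In this toy setting the mask is the rationale: a reader can inspect exactly which features reached the predictor. With only the final prediction as a training signal, the selector has to find a good mask among all 2^n subsets, which is the search problem the abstract describes.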
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Advanced Neural Network Applications
