Learn from A Rationalist: Distilling Intermediate Interpretable Rationales

Jiayi Dai; Randy Goebel

arXiv:2601.22531·cs.LG·February 2, 2026

TL;DR

This paper introduces REKD, a knowledge distillation approach for rationale extraction (RE) models: a smaller or less capable student model learns from a teacher model's rationales and predictions, which improves the student's predictive performance and interpretability.

Contribution

It proposes a novel knowledge distillation framework for rationale extraction models that improves performance and interpretability across language and vision tasks.

Findings

01 REKD significantly improves student model accuracy.

02 REKD is applicable to various neural network architectures.

03 REKD is effective on both language and vision datasets.

Abstract

Because of the pervasive use of deep neural networks (DNNs), especially in high-stakes domains, the interpretability of DNNs has received increased attention. The general idea of rationale extraction (RE) is to provide an interpretable-by-design framework for DNNs via a select-predict architecture where two neural networks learn jointly to perform feature selection and prediction, respectively. Given only the remote supervision from the final task prediction, the process of learning to select subsets of features (or rationales) requires searching in the space of all possible feature combinations, which is computationally challenging and even harder when the base neural networks are not sufficiently capable. To improve the predictive performance of RE models that are based on less capable or smaller neural networks (i.e., the students), we propose REKD (Rationale…
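The select-predict idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not cover REKD's distillation step: the linear selector and predictor, the top-k masking rule, and all names (`select_topk`, `select_predict`, `selector_w`, `predictor_w`, `k`) are assumptions chosen for illustration. It only shows that the predictor sees nothing but the selected features, and how quickly the space of candidate rationales grows.

```python
import numpy as np

def select_topk(scores, k):
    """Return a binary mask that keeps the k highest-scoring features."""
    mask = np.zeros_like(scores)
    mask[np.argsort(scores)[-k:]] = 1.0
    return mask

def select_predict(x, selector_w, predictor_w, k):
    """Toy select-predict pass: the predictor only sees selected features."""
    scores = x * selector_w        # hypothetical per-feature relevance score
    mask = select_topk(scores, k)  # the rationale: a subset of input features
    logit = np.dot(x * mask, predictor_w)
    prob = 1.0 / (1.0 + np.exp(-logit))  # binary prediction from the rationale
    return mask, prob

rng = np.random.default_rng(0)
x = rng.normal(size=8)
mask, prob = select_predict(x, rng.normal(size=8), rng.normal(size=8), k=3)

# With n features there are 2**n candidate subsets, which is why learning
# the selector from the final task signal alone is a hard search problem.
num_subsets = 2 ** x.size
```

In this toy setting the hard top-k mask is not differentiable, so a trainable version would need a relaxation or a sampling-based estimator; that choice is left open here.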

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Multimodal Machine Learning Applications · Advanced Neural Network Applications