Distribution Matching for Rationalization
Yongfeng Huang, Yujun Chen, Yulun Du, Zhilin Yang

TL;DR
This paper introduces a distribution matching approach to rationalization for text classification: extracted rationales are trained to share similar feature and output distributions with the full input text, which improves results over previous methods.
Contribution
It presents a novel distribution matching method that operates in both the feature space and the output space, addressing a gap in previous mutual-information-based approaches, which neglect the relationship between rationales and the input text.
Findings
Consistently outperforms previous rationalization methods by a large margin
Effective in aligning rationale and input text distributions
Enhances interpretability and prediction accuracy
Abstract
The task of rationalization aims to extract pieces of input text as rationales to justify neural network predictions on text classification tasks. By definition, rationales represent key text pieces used for prediction and thus should have similar classification feature distribution compared to the original input text. However, previous methods mainly focused on maximizing the mutual information between rationales and labels while neglecting the relationship between rationales and input text. To address this issue, we propose a novel rationalization method that matches the distributions of rationales and input text in both the feature space and output space. Empirically, the proposed distribution matching approach consistently outperforms previous methods by a large margin. Our data and code are available.
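The idea of matching rationales to the input text in two spaces can be illustrated with a small sketch. The abstract does not state which discrepancy measures are used, so everything below is an assumption made for illustration: feature-space mismatch is approximated by the squared distance between batch feature means, output-space mismatch by the KL divergence between predicted class distributions, and the names `feature_mismatch`, `output_mismatch`, `matching_loss`, `lam_feat`, and `lam_out` are hypothetical.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mean_vector(batch):
    """Component-wise mean of a batch of equal-length feature vectors."""
    n = len(batch)
    return [sum(col) / n for col in zip(*batch)]


def feature_mismatch(full_feats, rationale_feats):
    """Assumed feature-space term: squared distance between batch means.

    This compares first moments only; it is a stand-in, not the measure
    used in the paper.
    """
    mu_full = mean_vector(full_feats)
    mu_rat = mean_vector(rationale_feats)
    return sum((a - b) ** 2 for a, b in zip(mu_full, mu_rat))


def output_mismatch(full_logits, rationale_logits):
    """Assumed output-space term: mean KL between per-example predictions."""
    total = 0.0
    for z_full, z_rat in zip(full_logits, rationale_logits):
        total += kl_divergence(softmax(z_full), softmax(z_rat))
    return total / len(full_logits)


def matching_loss(full_feats, rationale_feats, full_logits, rationale_logits,
                  lam_feat=1.0, lam_out=1.0):
    """Weighted sum of the two mismatch terms (weights are hypothetical)."""
    return (lam_feat * feature_mismatch(full_feats, rationale_feats)
            + lam_out * output_mismatch(full_logits, rationale_logits))


if __name__ == "__main__":
    # Toy batch: features and logits computed from full text vs. rationale.
    full_feats = [[1.0, 0.0], [0.0, 1.0]]
    rat_feats = [[0.8, 0.1], [0.1, 0.7]]
    full_logits = [[2.0, 0.0], [0.0, 2.0]]
    rat_logits = [[1.5, 0.2], [0.1, 1.0]]
    print(matching_loss(full_feats, rat_feats, full_logits, rat_logits))
```

In a training setup such a term would typically be added to the usual classification loss on the rationale, so that the rationale selector is penalized when the rationale's features or predictions drift away from those of the full input. A mean-only comparison is deliberately simple; two batches with equal means but different spread would score zero mismatch, which is why richer distribution distances are usually preferred in practice.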
Taxonomy
Topics: Text and Document Classification Technologies · Natural Language Processing Techniques · Advanced Text Analysis Techniques
