The Irrationality of Neural Rationale Models

Yiming Zheng; Serena Booth; Julie Shah; Yilun Zhou

arXiv:2110.07550 · cs.CL · July 26, 2022 · 5 citations

TL;DR

Neural rationale models are popular for interpretability in NLP, but they may be less rational and interpretable than commonly assumed. The paper calls for more rigorous evaluation of their interpretability properties.

Contribution

This paper challenges the assumption that rationale models are inherently interpretable, providing philosophical and empirical evidence to question their rationality.

Findings

01 Rationale models may be less rational than assumed

02 Empirical evidence questions the interpretability of rationale models

03 Calls for more rigorous evaluation of interpretability properties

Abstract

Neural rationale models are popular for interpretable predictions of NLP tasks. In these, a selector extracts segments of the input text, called rationales, and passes these segments to a classifier for prediction. Since the rationale is the only information accessible to the classifier, it is plausibly defined as the explanation. Is such a characterization unconditionally correct? In this paper, we argue to the contrary, with both philosophical perspectives and empirical evidence suggesting that rationale models are, perhaps, less rational and interpretable than expected. We call for more rigorous and comprehensive evaluations of these models to ensure desired properties of interpretability are indeed achieved. The code can be found at https://github.com/yimingz89/Neural-Rationale-Analysis.
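
The select-then-predict pipeline described in the abstract can be illustrated with a minimal sketch. This is a toy example under stated assumptions, not the authors' implementation: the lexicon `SENTIMENT_WEIGHTS` and the functions `select_rationale`, `classify`, and `predict` are hypothetical names introduced here, and a keyword heuristic stands in for the neural selector and classifier. The only property it is meant to show is the structural one: the classifier sees nothing but the rationale.

```python
from typing import List, Tuple

Tokens = List[str]

# Hypothetical toy lexicon (an assumption for illustration, not from the paper).
SENTIMENT_WEIGHTS = {"great": 1.0, "good": 0.5, "bad": -0.5, "awful": -1.0}


def select_rationale(tokens: Tokens, k: int = 2) -> Tokens:
    """Toy selector: keep the k tokens with the largest absolute lexicon
    weight, returned in their original input order."""
    ranked = sorted(
        range(len(tokens)),
        key=lambda i: -abs(SENTIMENT_WEIGHTS.get(tokens[i].lower(), 0.0)),
    )
    keep = sorted(ranked[:k])
    return [tokens[i] for i in keep]


def classify(rationale: Tokens) -> str:
    """Toy classifier: it receives only the rationale, never the full input."""
    score = sum(SENTIMENT_WEIGHTS.get(t.lower(), 0.0) for t in rationale)
    return "positive" if score >= 0 else "negative"


def predict(text: str, k: int = 2) -> Tuple[Tokens, str]:
    """Run the pipeline: select a rationale, then classify from it alone."""
    tokens = text.split()
    rationale = select_rationale(tokens, k)
    return rationale, classify(rationale)


if __name__ == "__main__":
    rationale, label = predict("a great film with a bad ending", k=1)
    print(rationale, label)
```

In a real rationale model both stages are learned networks, so the selector's choice of tokens can itself depend on the input in ways the extracted text does not reveal. The sketch only fixes the interface between the two stages.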

Code & Models

Repositories

yimingz89/neural-rationale-analysis (PyTorch, official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Topic Modeling