Learning from Sufficient Rationales: Analysing the Relationship Between Explanation Faithfulness and Token-level Regularisation Strategies

Jonathan Kamp; Lisa Beinborn; Antske Fokkens

arXiv:2511.16353 · cs.CL · November 21, 2025

Open Access

TL;DR

This paper investigates how explanation sufficiency relates to token-level regularisation strategies and model performance, revealing complex interactions and limitations of current sufficiency metrics in understanding rationale quality.

Contribution

It links sufficiency to token classification and attention regularisation, highlighting their distinct impacts and the complexity of rationale effectiveness in NLP models.

Findings

1. Highly informative rationales do not necessarily improve classification accuracy.
2. Sufficiency captures the impact of non-rationale context, not rationale informativeness.
3. Incorporating rationales can enhance cross-domain performance, but results vary by task and model.

Abstract

Human explanations of natural language, rationales, form a tool to assess whether models learn a label for the right reasons or rely on dataset-specific shortcuts. Sufficiency is a common metric for estimating the informativeness of rationales, but it provides limited insight into the effects of rationale information on model performance. We address this limitation by relating sufficiency to two modelling paradigms: the ability of models to identify which tokens are part of the rationale (through token classification) and the ability to improve model performance by incorporating rationales in the input (through attention regularisation). We find that highly informative rationales are not likely to help classify the instance correctly. Sufficiency conversely captures the classification impact of the non-rationalised context, which interferes with rationale information in the same…
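
The abstract does not spell out how sufficiency is computed. As a rough illustration only, the sketch below assumes the commonly used definition in which sufficiency is the model's confidence in a label on the full input minus its confidence on the rationale tokens alone. The function names and the toy classifier are hypothetical stand-ins, not the authors' code.

```python
from typing import Callable, List, Sequence


def sufficiency(
    predict_proba: Callable[[Sequence[str]], List[float]],
    tokens: Sequence[str],
    rationale_mask: Sequence[int],
    label: int,
) -> float:
    """Assumed definition: p(label | full input) - p(label | rationale only).

    Values near zero or below mean the rationale alone supports the label
    about as well as (or better than) the full input.
    """
    if len(tokens) != len(rationale_mask):
        raise ValueError("tokens and rationale_mask must have the same length")
    # Keep only the tokens that annotators marked as part of the rationale.
    rationale_only = [t for t, keep in zip(tokens, rationale_mask) if keep]
    return predict_proba(tokens)[label] - predict_proba(rationale_only)[label]


def toy_predict_proba(tokens: Sequence[str]) -> List[float]:
    """Hypothetical classifier: smoothed share of 'good' tokens as P(positive)."""
    p_pos = (1 + sum(t == "good" for t in tokens)) / (2 + len(tokens))
    return [1.0 - p_pos, p_pos]


tokens = ["the", "film", "was", "good"]
mask = [0, 0, 0, 1]  # only "good" is marked as rationale
score = sufficiency(toy_predict_proba, tokens, mask, label=1)
```

In this toy setup the score is negative because removing the three non-rationale tokens raises the classifier's confidence. The number therefore reflects what the surrounding context does to the prediction as much as what the rationale contributes, which is the reading of sufficiency the findings describe.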

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Child and Animal Learning Development