Explanation Regularisation through the Lens of Attributions
Pedro Ferreira, Ivan Titov, Wilker Aziz

TL;DR
This paper investigates explanation regularisation (ER) in text classifiers and finds that increased reliance on plausible tokens does not necessarily improve out-of-domain (OOD) performance, which challenges earlier assumptions about ER's benefits.
Contribution
The study critically examines the relationship between ER, reliance on plausible features, and OOD performance, highlighting that stronger reliance on plausible tokens is not the main factor for OOD improvements.
Findings
Stronger reliance on plausible tokens does not correlate with better OOD performance.
The connection between ER guidance and reliance on plausible features has been overstated.
ER's benefits in OOD settings may not stem from increased reliance on human-annotated rationales.
Abstract
Explanation regularisation (ER) has been introduced as a way to guide text classifiers to form their predictions relying on input tokens that humans consider plausible. This is achieved by introducing an auxiliary explanation loss that measures how well the output of an input attribution technique for the model agrees with human-annotated rationales. The guidance appears to benefit performance in out-of-domain (OOD) settings, presumably due to an increased reliance on "plausible" tokens. However, previous work has under-explored the impact of guidance on that reliance, particularly when reliance is measured using attribution techniques different from those used to guide the model. In this work, we seek to close this gap, and also explore the relationship between reliance on plausible features and OOD performance. We find that the connection between ER and the ability of a classifier to…
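To make the idea of an auxiliary explanation loss concrete, here is a minimal sketch in plain Python. The abstract does not say which agreement measure or attribution technique the authors use, so everything specific here is an assumption made for illustration: the KL-divergence form of the loss, the weight `lam`, and the function names `explanation_loss`, `joint_loss`, and `plausible_mass` are all hypothetical. The attribution scores are taken as given, standing in for the output of whatever input attribution technique is applied to the model.

```python
import math


def normalise(scores):
    """Turn non-negative scores into a distribution over tokens."""
    total = sum(scores)
    if total == 0:
        # No signal at all: fall back to a uniform distribution.
        return [1.0 / len(scores)] * len(scores)
    return [s / total for s in scores]


def explanation_loss(attributions, rationale_mask, eps=1e-12):
    """Assumed agreement measure: KL(rationale target || attribution distribution).

    `attributions` holds one score per input token; `rationale_mask` holds 1 for
    tokens that annotators marked as part of the rationale and 0 otherwise.
    """
    p = normalise([abs(a) for a in attributions])      # model's attribution mass
    q = normalise([float(m) for m in rationale_mask])  # uniform over rationale tokens
    return sum(qi * math.log(qi / (pi + eps)) for qi, pi in zip(q, p) if qi > 0)


def joint_loss(task_loss, attributions, rationale_mask, lam=1.0):
    """Task loss plus the weighted explanation loss (lam is a hypothetical weight)."""
    return task_loss + lam * explanation_loss(attributions, rationale_mask)


def plausible_mass(attributions, rationale_mask):
    """Share of attribution mass that falls on human-annotated rationale tokens."""
    p = normalise([abs(a) for a in attributions])
    return sum(pi for pi, m in zip(p, rationale_mask) if m)


if __name__ == "__main__":
    mask = [0, 1, 0]                 # only the second token is in the rationale
    scores = [0.1, 0.8, 0.1]         # attribution scores from some technique
    print(explanation_loss(scores, mask))
    print(plausible_mass(scores, mask))
    print(joint_loss(0.5, scores, mask, lam=2.0))
```

The `plausible_mass` helper illustrates one simple way reliance on plausible tokens could be quantified. Computing it with scores from an attribution technique other than the one used inside `explanation_loss` corresponds to the cross-technique measurement the abstract describes as under-explored.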
