Learning the Difference that Makes a Difference with Counterfactually-Augmented Data

Divyansh Kaushik; Eduard Hovy; Zachary C. Lipton

arXiv:1909.12434 · cs.CL · February 18, 2020 · 232 cites

TL;DR

This paper introduces counterfactually-augmented data for NLP, enabling models to become less sensitive to spurious patterns by training on both original and human-revised counterfactual examples, improving robustness.

Contribution

It presents methods and resources for creating counterfactually-revised datasets in NLP, reducing models' reliance on spurious correlations and enhancing their causal robustness.

Findings

01

Classifiers trained on the combined original and counterfactually-revised data perform nearly as well as classifiers specialized to either dataset alone.

02

Training on combined data reduces sensitivity to spurious features.

03

Counterfactually-revised data improves model robustness against confounding factors.

Abstract

Despite alarm over the reliance of machine learning systems on so-called spurious patterns, the term lacks coherent meaning in standard statistical frameworks. However, the language of causality offers clarity: spurious associations are due to confounding (e.g., a common cause), but not direct or indirect causal effects. In this paper, we focus on natural language processing, introducing methods and resources for training models less sensitive to spurious patterns. Given documents and their initial labels, we task humans with revising each document so that it (i) accords with a counterfactual target label; (ii) retains internal coherence; and (iii) avoids unnecessary changes. Interestingly, on sentiment analysis and natural language inference tasks, classifiers trained on original data fail on their counterfactually-revised counterparts and vice versa. Classifiers trained on combined…
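The revision protocol and the combined training set described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the `Example` class, the `changed_tokens` and `combine` helpers, and the toy review pair are all hypothetical, invented here for illustration. The sketch assumes whitespace tokenization and a two-way sentiment label.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Example:
    """A document with its label (hypothetical container, not from the paper)."""
    text: str
    label: str  # e.g. "positive" or "negative"


def changed_tokens(original: str, revised: str) -> tuple[list[str], list[str]]:
    """Return (tokens removed from the original, tokens added in the revision)."""
    a, b = original.split(), revised.split()
    removed, added = [], []
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed.extend(a[i1:i2])
            added.extend(b[j1:j2])
    return removed, added


def combine(pairs: list[tuple[Example, Example]]) -> list[Example]:
    """Flatten (original, revised) pairs into one combined training list."""
    combined = []
    for original, revised in pairs:
        # Condition (i): the revision must accord with a different target label.
        if original.label == revised.label:
            raise ValueError("a counterfactual revision must flip the label")
        combined.extend([original, revised])
    return combined


# Toy pair invented for illustration; not taken from the released datasets.
pairs = [
    (
        Example("The plot was gripping and the acting superb", "positive"),
        Example("The plot was tedious and the acting wooden", "negative"),
    )
]

train_set = combine(pairs)
removed, added = changed_tokens(pairs[0][0].text, pairs[0][1].text)
print(removed, added)
```

In this toy pair only the sentiment-bearing words change, which is the point of condition (iii): everything the annotator leaves untouched appears under both labels in the combined set, so a classifier gains nothing by relying on it.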

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Topic Modeling · Machine Learning in Healthcare