NeuroCounterfactuals: Beyond Minimal-Edit Counterfactuals for Richer Data Augmentation

Phillip Howard; Gadi Singer; Vasudev Lal; Yejin Choi; Swabha Swayamdipta

arXiv:2210.12365 · cs.CL · October 25, 2022 · 1 citation

TL;DR

NeuroCounterfactuals introduces a generative method for creating rich, naturalistic counterfactual data with larger edits, improving sentiment classification robustness beyond minimal-edit approaches.

Contribution

The paper proposes NeuroCounterfactuals, a novel generative approach that produces diverse, naturalistic counterfactuals with larger edits for data augmentation in NLP.

Findings

1. Outperforms manual counterfactuals in sentiment classification tasks.
2. Enhances in-domain and out-of-domain model robustness.
3. Provides detailed analysis of advantages over minimal-edit methods.

Abstract

While counterfactual data augmentation offers a promising step towards robust generalization in natural language processing, producing a set of counterfactuals that offer valuable inductive bias for models remains a challenge. Most existing approaches for producing counterfactuals, manual or automated, rely on small perturbations via minimal edits, resulting in simplistic changes. We introduce NeuroCounterfactuals, designed as loose counterfactuals, allowing for larger edits which result in naturalistic generations containing linguistic diversity, while still bearing similarity to the original document. Our novel generative approach bridges the benefits of constrained decoding, with those of language model adaptation for sentiment steering. Training data augmentation with our generations results in both in-domain and out-of-domain improvements for sentiment classification, outperforming…

Code & Models

Repositories

intellabs/neurocounterfactuals (Official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Counterfactual Explanations