Unsupervised Data Augmentation with Naive Augmentation and without Unlabeled Data
David Lowell, Brian E. Howard, Zachary C. Lipton, Byron C. Wallace

TL;DR
This paper empirically re-examines Unsupervised Data Augmentation (UDA) in NLP, showing that it remains effective even without unlabeled data and that simple augmentation can match complex methods.
Contribution
It demonstrates that UDA's benefits can be achieved with simple augmentation and without unlabeled data, challenging prior assumptions about its complexity and unsupervised nature.
Findings
Simple augmentation often matches complex methods.
UDA improves performance even without unlabeled data.
Enforcing consistency is key to UDA's success.
Abstract
Unsupervised Data Augmentation (UDA) is a semi-supervised technique that applies a consistency loss to penalize differences between a model's predictions on (a) observed (unlabeled) examples; and (b) corresponding 'noised' examples produced via data augmentation. While UDA has gained popularity for text classification, open questions linger over which design decisions are necessary and over how to extend the method to sequence labeling tasks. In this paper, we re-examine UDA and demonstrate its efficacy on several sequential tasks. Our main contribution is an empirical study of UDA to establish which components of the algorithm confer benefits in NLP. Notably, although prior work has emphasized the use of clever augmentation techniques including back-translation, we find that enforcing consistency between predictions…
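The consistency loss described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the penalty is a KL divergence between the class distributions predicted for an example and for its noised copy, and it uses uniform random word substitution as a stand-in for a "naive" augmentation. The function names (`consistency_loss`, `random_word_substitution`) and the `rate` parameter are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(logits_original, logits_noised):
    """Penalty on disagreement between predictions for an example and its
    noised copy. Assumption: KL divergence is the distance measure."""
    p = softmax(logits_original)
    q = softmax(logits_noised)
    return kl_divergence(p, q)


def random_word_substitution(tokens, vocab, rate, rng):
    """Hypothetical naive augmentation: replace each token with a uniformly
    sampled vocabulary word with probability `rate`."""
    return [rng.choice(vocab) if rng.random() < rate else t for t in tokens]


if __name__ == "__main__":
    rng = random.Random(0)
    tokens = ["the", "movie", "was", "great"]
    vocab = ["the", "movie", "was", "great", "bad", "film", "a"]
    print(random_word_substitution(tokens, vocab, rate=0.3, rng=rng))
    # The logits below are made-up numbers standing in for a model's outputs.
    print(consistency_loss([2.0, 0.5], [1.5, 0.7]))
```

In a training loop, a term like this would be added to the supervised loss. No unlabeled data is needed to compute it, because it compares two predictions and never consults a label.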
