Neural Data Augmentation via Example Extrapolation
Kenton Lee, Kelvin Guu, Luheng He, Tim Dozat, Hyung Won Chung

TL;DR
This paper introduces a neural data augmentation method called Example Extrapolation (Ex2) that synthesizes new examples from few-shot samples, improving performance on language understanding tasks with limited data.
Contribution
The paper presents a novel neural approach for data augmentation that extrapolates new examples from few-shot samples, outperforming existing methods on multiple benchmarks.
Findings
Significant improvement on the FewRel relation extraction benchmark.
Enhanced accuracy on the SNIPS intent classification and slot filling tasks.
Effective augmentation for underrepresented data slices.
Abstract
In many applications of machine learning, certain categories of examples may be underrepresented in the training data, causing systems to underperform on such "few-shot" cases at test time. A common remedy is to perform data augmentation, such as by duplicating underrepresented examples, or heuristically synthesizing new examples. But these remedies often fail to cover the full diversity and complexity of real examples. We propose a data augmentation approach that performs neural Example Extrapolation (Ex2). Given a handful of exemplars sampled from some distribution, Ex2 synthesizes new examples that also belong to the same distribution. The Ex2 model is learned by simulating the example generation procedure on data-rich slices of the data, and it is applied to underrepresented, few-shot slices. We apply Ex2 to a range of language understanding tasks and significantly improve over…
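The training setup described in the abstract (simulating example generation on data-rich slices) can be illustrated with a minimal sketch. It assumes each slice is simply a list of text examples, and that one training pair consists of a few sampled exemplars as input and one held-out example from the same slice as the target. The function names, the separator string, and the slice-size threshold are hypothetical choices for illustration and are not taken from the paper.

```python
import random
from typing import Dict, List, Tuple


def make_extrapolation_pairs(
    slices: Dict[str, List[str]],
    num_exemplars: int = 3,
    pairs_per_slice: int = 2,
    seed: int = 0,
) -> List[Tuple[List[str], str]]:
    """Build (exemplars, target) pairs from data-rich slices.

    For every slice with at least num_exemplars + 1 examples, sample
    num_exemplars + 1 distinct examples: the first num_exemplars are the
    model input, the last one is the held-out target to be generated.
    Smaller slices are skipped (assumption: they are the few-shot slices
    the trained model would later be applied to).
    """
    rng = random.Random(seed)
    pairs: List[Tuple[List[str], str]] = []
    for name in sorted(slices):  # sorted for reproducible ordering
        examples = slices[name]
        if len(examples) < num_exemplars + 1:
            continue
        for _ in range(pairs_per_slice):
            sampled = rng.sample(examples, num_exemplars + 1)
            pairs.append((sampled[:-1], sampled[-1]))
    return pairs


def format_input(exemplars: List[str], sep: str = " | ") -> str:
    """Join exemplars into a single input string (separator is an assumption)."""
    return sep.join(exemplars)
```

At augmentation time, the same input format would be built from the handful of exemplars available in a few-shot slice, and a trained generative model would be asked to produce additional examples for that slice. The generator itself is omitted here because the abstract excerpt does not specify it.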
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning
