Data Augmentation for Text Generation Without Any Augmented Data
Wei Bi, Huayang Li, Jiacheng Huang

TL;DR
This paper derives a training objective that captures the effect of data augmentation for text generation without constructing any augmented data or choosing a specific mapping function.
Contribution
It derives an objective that formulates data augmentation for text generation without any augmented data constructed by specific mapping functions. The objective can be optimized efficiently and applied to popular text generation loss functions.
Findings
Approximates or even surpasses popular data augmentation methods on five datasets across two text generation tasks
Efficient optimization with a convergence rate guarantee
Applicable to popular loss functions used in text generation tasks
Abstract
Data augmentation is an effective way to improve the performance of many neural text generation models. However, current data augmentation methods need to define or choose proper data mapping functions that map the original samples into the augmented samples. In this work, we derive an objective to formulate the problem of data augmentation on text generation tasks without any use of augmented data constructed by specific mapping functions. Our proposed objective can be efficiently optimized and applied to popular loss functions on text generation tasks with a convergence rate guarantee. Experiments on five datasets of two text generation tasks show that our approach can approximate or even surpass popular data augmentation methods.
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
