SDA: Improving Text Generation with Self Data Augmentation
Ping Yu, Ruiyi Zhang, Yang Zhao, Yizhe Zhang, Chunyuan Li, Changyou Chen

TL;DR
This paper introduces a self data augmentation technique for text generation that enhances maximum likelihood estimation by automatically generating training data, leading to improved performance across multiple datasets.
Contribution
It presents a novel self-imitation-learning framework for data augmentation in text generation, adaptable to any MLE-based training, and capable of incorporating task-specific evaluation metrics.
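As a rough illustration of the idea of training by MLE, generating samples from the model itself, scoring them with a task-specific metric, and feeding the best ones back into training, here is a minimal toy sketch. It is not the paper's implementation: a count-based bigram model stands in for a neural generator, a simple filter-and-refit loop stands in for the self-imitation-learning phase, and every name here (fit_bigram_mle, sample, distinct_ratio, self_augment, and the rounds, n_samples, and keep parameters) is hypothetical.

```python
import random
from collections import Counter, defaultdict

BOS, EOS = "<s>", "</s>"  # hypothetical boundary markers


def fit_bigram_mle(corpus):
    """MLE for a bigram model: normalized next-token counts."""
    counts = defaultdict(Counter)
    for sent in corpus:
        tokens = [BOS] + sent + [EOS]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {tok: c / sum(ctr.values()) for tok, c in ctr.items()}
        for prev, ctr in counts.items()
    }


def sample(model, rng, max_len=12):
    """Draw one sentence from the model, token by token."""
    sent, prev = [], BOS
    while len(sent) < max_len:
        dist = model[prev]
        nxt = rng.choices(list(dist), weights=list(dist.values()))[0]
        if nxt == EOS:
            break
        sent.append(nxt)
        prev = nxt
    return sent


def distinct_ratio(sent):
    """Toy metric (an assumption): fraction of distinct tokens.
    Sentences that repeat tokens score lower."""
    return len(set(sent)) / len(sent) if sent else 0.0


def self_augment(corpus, metric, rounds=3, n_samples=50, keep=5, seed=0):
    """Alternate MLE fitting with adding the model's own best samples."""
    rng = random.Random(seed)
    data = [list(s) for s in corpus]
    model = fit_bigram_mle(data)
    for _ in range(rounds):
        samples = [sample(model, rng) for _ in range(n_samples)]
        samples.sort(key=metric, reverse=True)  # highest-scoring first
        data.extend(samples[:keep])             # augment the training set
        model = fit_bigram_mle(data)            # refit by MLE
    return model, data


if __name__ == "__main__":
    toy = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
    model, data = self_augment(toy, distinct_ratio, rounds=2, keep=3)
    for sent in data[len(toy):]:
        print(" ".join(sent))
```

Swapping distinct_ratio for another scoring function is where a task-specific metric would plug in, for example one that rewards a target vocabulary. Because the metric only ranks samples, it does not need to be differentiable in this sketch.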
Findings
Significant performance improvements on synthetic and real datasets.
Enhanced control over generated sentence quality and diversity.
Outperforms existing augmentation methods in text generation tasks.
Abstract
Data augmentation has been widely used to improve deep neural networks in many research fields, such as computer vision. However, less work has been done in the context of text, partially due to its discrete nature and the complexity of natural languages. In this paper, we propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation. Unlike most existing sentence-level augmentation strategies, which are only applied to specific models, our method is more general and could be easily adapted to any MLE-based training procedure. In addition, our framework allows task-specific evaluation metrics to be designed to flexibly control the generated sentences, for example, in terms of controlling vocabulary usage and avoiding nontrivial repetitions. Extensive experimental results demonstrate the…
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning
