Text Smoothing: Enhance Various Data Augmentation Methods on Text Classification Tasks
Xing Wu, Chaochen Gao, Meng Lin, Liangjun Zang, Zhongyuan Wang, Songlin Hu

TL;DR
This paper introduces text smoothing, a data augmentation technique that replaces one-hot token representations with probabilistic smoothed versions from a masked language model, improving text classification performance especially in low-resource settings.
Contribution
The paper proposes a novel text smoothing method that enhances data augmentation by using pre-trained language models to generate more informative token representations.
Findings
Text smoothing outperforms mainstream augmentation methods in low-resource scenarios.
Combining text smoothing with other augmentation methods yields further improvements.
The method is efficient and is evaluated on several benchmarks in a low-resource regime.
Abstract
Before entering the neural network, a token is generally converted to its corresponding one-hot representation, which is a discrete distribution over the vocabulary. A smoothed representation is the probability distribution over candidate tokens obtained from a pre-trained masked language model, and it can be seen as a more informative substitute for the one-hot representation. We propose an efficient data augmentation method, termed text smoothing, which converts a sentence from its one-hot representation to a controllable smoothed representation. We evaluate text smoothing on different benchmarks in a low-resource regime. Experimental results show that text smoothing outperforms various mainstream data augmentation methods by a substantial margin. Moreover, text smoothing can be combined with those data augmentation methods to achieve better performance.
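As a rough illustration of the idea in the abstract, the Python sketch below blends a token's one-hot vector with a probability distribution over the vocabulary. It is not the authors' implementation: the interpolation weight `lam`, the `temperature` parameter, the toy vocabulary, the toy embedding table, and the hand-written logits (standing in for the output of a pre-trained masked language model) are all assumptions made for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution over the vocabulary."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(index, vocab_size):
    """One-hot representation of a token id."""
    return [1.0 if i == index else 0.0 for i in range(vocab_size)]


def smooth_token(token_id, mlm_logits, lam=0.5, temperature=1.0):
    """Blend the one-hot vector with the (assumed) MLM distribution.

    lam is a hypothetical control knob: 1.0 keeps the one-hot vector,
    0.0 uses only the language model's distribution.
    """
    probs = softmax(mlm_logits, temperature)
    hard = one_hot(token_id, len(mlm_logits))
    return [lam * h + (1.0 - lam) * p for h, p in zip(hard, probs)]


def embed(distribution, embedding_table):
    """Weighted sum of embedding rows; a one-hot input selects a single row."""
    dim = len(embedding_table[0])
    return [
        sum(w * row[d] for w, row in zip(distribution, embedding_table))
        for d in range(dim)
    ]


# Toy setup: a four-word vocabulary and made-up logits for one position.
vocab = ["good", "great", "bad", "movie"]
toy_logits = [2.0, 1.5, -1.0, -3.0]  # stand-in for real MLM output
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

mixed = smooth_token(vocab.index("good"), toy_logits, lam=0.5)
smoothed_embedding = embed(mixed, toy_embeddings)
```

With `lam = 1.0` the result is the original one-hot vector, while lower values shift more weight onto the language model's candidate tokens. The smoothed vector can then be multiplied with the embedding table in place of a plain lookup.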
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
