LLMs vs Established Text Augmentation Techniques for Classification: When do the Benefits Outweigh the Costs?
Jan Cegin, Jakub Simko, Peter Brusilovsky

TL;DR
This study compares large language model-based text augmentation with traditional methods across multiple datasets and classifiers, revealing that LLMs are beneficial mainly with very few seeds and often do not outperform established techniques.
Contribution
The paper provides a comprehensive comparison of LLM-based and traditional text augmentation methods, including cost-benefit analysis across various settings.
Findings
LLM augmentation is advantageous only with very few seeds.
Established methods often achieve similar or better accuracy.
Cost-benefit analysis favors LLMs primarily in low-seed scenarios.
Abstract
Generative large language models (LLMs) are increasingly being used for data augmentation tasks, where text samples are LLM-paraphrased and then used for classifier fine-tuning. However, research confirming a clear cost-benefit advantage of LLMs over more established augmentation methods is largely missing. To study whether (and when) LLM-based augmentation is advantageous, we compared the effects of recent LLM augmentation methods with established ones on 6 datasets, 3 classifiers and 2 fine-tuning methods. We also varied the number of seeds and collected samples to better explore the downstream model accuracy space. Finally, we performed a cost-benefit analysis and showed that LLM-based methods are worthy of deployment only when a very small number of seeds is used. Moreover, in many cases, established methods lead to similar or better model accuracies.
Taxonomy
Topics: Text Readability and Simplification · Natural Language Processing Techniques
