Controllable Data Augmentation for Few-Shot Text Mining with Chain-of-Thought Attribute Manipulation
Letian Peng, Yuwei Zhang, Jingbo Shang

TL;DR
This paper introduces CoTAM, a controllable data augmentation method for few-shot NLP tasks that uses chain-of-thought prompting to directly manipulate task-specific attributes in text, improving both fine-tuning and in-context learning performance.
Contribution
We propose Chain-of-Thought Attribute Manipulation (CoTAM), a new approach that directly edits text attributes via chain-of-thought prompting for effective data augmentation.
Findings
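CoTAM outperforms other LLM-based augmentation methods on text (pair) classification, aspect-based sentiment analysis, and conditional text generation, given the same number of training examples.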
Augmented datasets reveal human-recognizable decision boundaries.
The method enhances both fine-tuning and in-context learning performance.
Abstract
Prompting large language models (LLMs) for data augmentation has recently become a common practice in few-shot NLP tasks. In this paper, we propose Chain-of-Thought Attribute Manipulation (CoTAM), a novel approach that generates new data from existing examples by tweaking only the user-provided, task-specific attribute, e.g., sentiment polarity or topic in movie reviews. Instead of the conventional approach of controlling latent representations, we leverage chain-of-thought prompting to directly edit the text in three steps: (1) attribute decomposition, (2) manipulation proposal, and (3) sentence reconstruction. Extensive results on various tasks, such as text (pair) classification, aspect-based sentiment analysis, and conditional text generation, verify the superiority of CoTAM over other LLM-based augmentation methods with the same number of training examples for both fine-tuning and in-context…
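The three steps named in the abstract can be pictured as a single prompt that walks an LLM through decomposition, proposal, and reconstruction. The sketch below is only an illustration under stated assumptions: the prompt wording, the `ManipulationRequest` type, the `build_prompt` and `extract_result` helpers, and the `Result:` output convention are all hypothetical and are not taken from the paper. It makes no LLM call; it only builds the prompt text and parses a response string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ManipulationRequest:
    """Hypothetical container for one augmentation request."""
    text: str          # an existing few-shot example
    attribute: str     # task-specific attribute, e.g. "sentiment polarity"
    source_value: str  # value the example currently has
    target_value: str  # value the augmented example should have


def build_prompt(req: ManipulationRequest) -> str:
    """Assemble an illustrative three-step chain-of-thought prompt.

    The wording is an assumption; only the step names come from the abstract.
    """
    steps = [
        "1. Attribute decomposition: list the attributes of the text, "
        f"including its {req.attribute} ({req.source_value}).",
        "2. Manipulation proposal: describe how to change only the "
        f"{req.attribute} to {req.target_value}, keeping other attributes fixed.",
        "3. Sentence reconstruction: write the new text on a final line "
        "that starts with 'Result:'.",
    ]
    return f'Text: "{req.text}"\n' + "\n".join(steps)


def extract_result(response: str) -> Optional[str]:
    """Return the text after the last 'Result:' line, or None if absent."""
    for line in reversed(response.splitlines()):
        stripped = line.strip()
        if stripped.startswith("Result:"):
            return stripped[len("Result:"):].strip()
    return None


req = ManipulationRequest(
    text="A moving, beautifully shot film.",
    attribute="sentiment polarity",
    source_value="positive",
    target_value="negative",
)
prompt = build_prompt(req)
```

The resulting `prompt` string would be sent to whichever LLM is in use, and `extract_result` would then pull the reconstructed sentence out of the reply. Keeping the new label (`target_value`) alongside the extracted text gives one augmented training example per request.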
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
