Label-Consistent Data Generation for Aspect-Based Sentiment Analysis Using LLM Agents
Mohammad H.A. Monfared, Lucie Flek, Akbar Karimi

TL;DR
This paper introduces an agentic data augmentation approach for Aspect-Based Sentiment Analysis that iteratively generates and verifies synthetic data, improving label accuracy and model performance over traditional prompting methods.
Contribution
The paper presents a novel agentic augmentation method that outperforms prompting-based baselines in generating high-quality, label-consistent synthetic data for ABSA tasks.
Findings
Agentic augmentation yields higher label preservation than prompting.
Combining augmented data with real data improves model performance.
The method's benefits are more pronounced for models with less pretraining.
Abstract
We propose an agentic data augmentation method for Aspect-Based Sentiment Analysis (ABSA) that uses iterative generation and verification to produce high-quality synthetic training examples. To isolate the effect of agentic structure, we also develop a closely matched prompting-based baseline using the same model and instructions. Both methods are evaluated across three ABSA subtasks (Aspect Term Extraction (ATE), Aspect Term Sentiment Classification (ATSC), and Aspect Sentiment Pair Extraction (ASPE)), four SemEval datasets, and two encoder-decoder models: T5-Base and Tk-Instruct. Our results show that the agentic augmentation outperforms raw prompting in label preservation of the augmented data, especially when the tasks require aspect term generation. In addition, when combined with real data, agentic augmentation provides higher gains, consistently outperforming prompting-based…
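The iterative generation and verification idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `AbsaExample`, `aspects_present`, `augment`, and `make_stub_generator` are hypothetical, the generator is a stub standing in for an LLM call, and the verifier uses a simple assumed rule (every labelled aspect term must still appear in the generated text) as a stand-in for whatever label-consistency check the paper uses.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class AbsaExample:
    text: str
    # (aspect term, polarity) pairs, e.g. ("battery life", "positive")
    labels: Tuple[Tuple[str, str], ...]


def aspects_present(candidate: AbsaExample) -> bool:
    """Assumed verifier: every labelled aspect term must occur in the text."""
    lowered = candidate.text.lower()
    return all(aspect.lower() in lowered for aspect, _ in candidate.labels)


def augment(
    seed: AbsaExample,
    generate: Callable[[AbsaExample, Optional[str]], str],
    verify: Callable[[AbsaExample], bool],
    max_rounds: int = 3,
) -> Optional[AbsaExample]:
    """Generate a variant of `seed`, retrying with feedback until it verifies.

    Returns None if no candidate passes within `max_rounds`, so that
    label-inconsistent text never enters the augmented training set.
    """
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        text = generate(seed, feedback)
        # The candidate inherits the seed's labels; verification checks
        # that the new text is still consistent with them.
        candidate = AbsaExample(text=text, labels=seed.labels)
        if verify(candidate):
            return candidate
        feedback = "A labelled aspect term is missing; keep all of them verbatim."
    return None


def make_stub_generator() -> Callable[[AbsaExample, Optional[str]], str]:
    """Stub standing in for an LLM: the first draft drops the aspect term,
    and the draft produced after feedback keeps it."""

    def generate(seed: AbsaExample, feedback: Optional[str]) -> str:
        if feedback is None:
            return "The power pack lasts all day."
        return "The battery life easily lasts all day."

    return generate


if __name__ == "__main__":
    seed = AbsaExample("The battery life is great.", (("battery life", "positive"),))
    print(augment(seed, make_stub_generator(), aspects_present))
```

In this sketch a single-pass prompting baseline corresponds to `max_rounds=1`: the first draft would be rejected (or, without a verifier, accepted with a label that no longer matches the text), which is the kind of label drift the verification loop is meant to catch.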
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Emotion and Mood Recognition · Topic Modeling
