Large-Scale Aspect-Based Sentiment Analysis with Reasoning-Infused LLMs
Paweł Liskowski, Krzysztof Jankowski

TL;DR
This paper presents Arctic-ABSA, a large-scale, multilingual aspect-based sentiment analysis framework that leverages reasoning techniques and synthetic data to outperform existing models and set new benchmarks.
Contribution
It introduces Arctic-ABSA, a novel large-scale, multilingual ABSA model with reasoning infusion and expanded sentiment classes, achieving state-of-the-art results.
Findings
Models outperform GPT-4o and Claude 3.5 Sonnet by up to 10 percentage points in accuracy.
Achieves 87-91% accuracy across six languages.
Sets a new state of the art on the SemEval14 benchmark.
Abstract
We introduce Arctic-ABSA, a collection of powerful models for real-life aspect-based sentiment analysis (ABSA). Our models are tailored to commercial needs, trained on a large corpus of public data alongside carefully generated synthetic data, resulting in a dataset 20 times larger than SemEval14. We extend typical ABSA models by expanding the number of sentiment classes from the standard three (positive, negative, neutral) to five, adding mixed and unknown classes, while also jointly predicting overall text sentiment and supporting multiple languages. We experiment with reasoning injection by fine-tuning on Chain-of-Thought (CoT) examples and introduce a novel reasoning pretraining technique for encoder-only models that significantly improves downstream fine-tuning and generalization. Our 395M-parameter encoder and 8B-parameter decoder achieve up to 10 percentage points higher accuracy…
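The expanded label set described in the abstract can be pictured as a small data structure. The following is only an illustrative sketch: the names `Sentiment` and `AbsaPrediction`, the field layout, and the example review are assumptions made here, not the paper's actual interface, data, or model output.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Sentiment(str, Enum):
    # The five classes named in the abstract:
    # the standard three plus "mixed" and "unknown".
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"
    MIXED = "mixed"
    UNKNOWN = "unknown"


@dataclass
class AbsaPrediction:
    # Hypothetical container: one overall label for the whole text,
    # predicted jointly with one label per aspect.
    text: str
    overall: Sentiment
    aspects: Dict[str, Sentiment] = field(default_factory=dict)


# Hand-written example for illustration only (not produced by any model).
example = AbsaPrediction(
    text="The battery lasts all day, but the screen scratches easily.",
    overall=Sentiment.MIXED,
    aspects={"battery": Sentiment.POSITIVE, "screen": Sentiment.NEGATIVE},
)
```

Keeping the overall label separate from the per-aspect labels mirrors the joint prediction the abstract mentions: a text can be mixed overall while each individual aspect is clearly positive or negative.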
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Topic Modeling · Text and Document Classification Technologies
