LACA: Improving Cross-lingual Aspect-Based Sentiment Analysis with LLM Data Augmentation

Jakub Šmíd; Pavel Přibáň; Pavel Král

arXiv:2508.09515 · cs.CL · August 14, 2025

TL;DR

This paper introduces a novel LLM-based data augmentation approach for cross-lingual aspect-based sentiment analysis, eliminating reliance on translation tools and improving performance across multiple languages.

Contribution

The paper presents a new LLM-driven pseudo-labelling method that enhances cross-lingual ABSA without translation, outperforming existing translation-based techniques.

Findings

1. Outperforms previous translation-based methods in six languages
2. Effective across five backbone models
3. Fine-tuned LLMs outperform smaller multilingual models

Abstract

Cross-lingual aspect-based sentiment analysis (ABSA) involves detailed sentiment analysis in a target language by transferring knowledge from a source language with available annotated data. Most existing methods depend heavily on often unreliable translation tools to bridge the language gap. In this paper, we propose a new approach that leverages a large language model (LLM) to generate high-quality pseudo-labelled data in the target language without the need for translation tools. First, the framework trains an ABSA model to obtain predictions for unlabelled target-language data. Next, an LLM is prompted to generate natural sentences that better represent these noisy predictions than the original text. The ABSA model is then further fine-tuned on the resulting pseudo-labelled dataset. We demonstrate the effectiveness of this method across six languages and five backbone models,…
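
The data-generation steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `build_pseudo_labelled_dataset`, `toy_predict`, and `toy_generate` are hypothetical, the (aspect term, polarity) label format is an assumption, and skipping sentences with no predicted aspect is an assumption as well. The ABSA model and the LLM are passed in as plain callables so the sketch runs without any model; the final fine-tuning step is not shown.

```python
from typing import Callable, List, Tuple

# Assumption: an ABSA label is a list of (aspect term, sentiment polarity) pairs.
Label = List[Tuple[str, str]]


def build_pseudo_labelled_dataset(
    unlabelled: List[str],
    predict: Callable[[str], Label],
    generate: Callable[[str, Label], str],
) -> List[Tuple[str, Label]]:
    """Pair each noisy prediction with a generated sentence that fits it."""
    dataset = []
    for sentence in unlabelled:
        # Step 1: noisy prediction from the source-trained ABSA model.
        labels = predict(sentence)
        # Assumption: sentences with no predicted aspect are skipped.
        if not labels:
            continue
        # Step 2: the LLM writes a sentence that matches the predicted labels.
        new_sentence = generate(sentence, labels)
        dataset.append((new_sentence, labels))
    return dataset


# Hypothetical toy stand-ins for the ABSA model and the LLM.
def toy_predict(sentence: str) -> Label:
    return [("pizza", "positive")] if "pizza" in sentence else []


def toy_generate(sentence: str, labels: Label) -> str:
    return " ".join(f"The {aspect} was {polarity}." for aspect, polarity in labels)


pseudo = build_pseudo_labelled_dataset(
    ["pizza ok but slow", "nothing to say"], toy_predict, toy_generate
)
```

In a real setting, `predict` would wrap the ABSA model trained on source-language data and `generate` would wrap a prompted LLM; the returned pairs would then serve as the pseudo-labelled dataset for further fine-tuning.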
