Extending Czech Aspect-Based Sentiment Analysis with Opinion Terms: Dataset and LLM Benchmarks
Jakub Šmíd, Pavel Přibáň, Pavel Král

TL;DR
This paper presents a new Czech dataset for aspect-based sentiment analysis with opinion annotations, benchmarks Transformer models including LLMs, and proposes a translation-alignment method to improve cross-lingual performance in low-resource settings.
Contribution
It introduces a novel Czech ABSA dataset with opinion annotations and develops a translation-alignment approach for low-resource language adaptation.
Findings
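Transformer-based models, including LLMs, show both strengths and limitations on Czech ABSA tasks across monolingual, cross-lingual, and multilingual settings.
The LLM-based translation and label alignment method yields consistent improvements in cross-lingual performance.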
Error analysis highlights challenges in detecting subtle opinion terms.
Abstract
This paper introduces a novel Czech dataset in the restaurant domain for aspect-based sentiment analysis (ABSA), enriched with annotations of opinion terms. The dataset supports three distinct ABSA tasks involving opinion terms, accommodating varying levels of complexity. Leveraging this dataset, we conduct extensive experiments using modern Transformer-based models, including large language models (LLMs), in monolingual, cross-lingual, and multilingual settings. To address cross-lingual challenges, we propose a translation and label alignment methodology leveraging LLMs, which yields consistent improvements. Our results highlight the strengths and limitations of state-of-the-art models, especially when handling the linguistic intricacies of low-resource languages like Czech. A detailed error analysis reveals key challenges, including the detection of subtle opinion terms and nuanced…
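To make the idea of opinion-term annotations and label alignment concrete, here is a minimal Python sketch. It is not the authors' method: the paper's alignment relies on LLMs, whereas this sketch only checks, by plain case-insensitive string matching, whether translated aspect and opinion terms can be located in a translated sentence. The `OpinionTriplet` class, the `find_span` and `is_aligned` helpers, and the example sentence are all hypothetical and are not taken from the dataset.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class OpinionTriplet:
    """Hypothetical annotation unit: an aspect term, the opinion term
    that describes it, and a sentiment polarity label."""
    aspect: str
    opinion: str
    polarity: str


def find_span(text: str, term: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) character offsets of `term` in `text`
    using a case-insensitive search, or None if the term is absent."""
    start = text.lower().find(term.lower())
    if start == -1:
        return None
    return (start, start + len(term))


def is_aligned(text: str, triplet: OpinionTriplet) -> bool:
    """A translated label counts as aligned here only if both its
    aspect term and its opinion term occur in the translated text."""
    return (
        find_span(text, triplet.aspect) is not None
        and find_span(text, triplet.opinion) is not None
    )


# Illustrative example (invented, not from the dataset):
# English source "The soup was delicious." translated into Czech.
czech_sentence = "Polévka byla výborná."
good = OpinionTriplet(aspect="polévka", opinion="výborná", polarity="positive")
bad = OpinionTriplet(aspect="polévka", opinion="delicious", polarity="positive")

print(find_span(czech_sentence, good.aspect))
print(is_aligned(czech_sentence, good))
print(is_aligned(czech_sentence, bad))
```

Exact string matching is the simplest possible alignment check. It fails as soon as the translated term appears in a different inflected form than in the sentence, which is common in Czech, and that is one reason a more flexible alignment step is useful.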
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Topic Modeling · Hate Speech and Cyberbullying Detection
