Using Contextually Aligned Online Reviews to Measure LLMs' Performance Disparities Across Language Varieties
Zixin Tang, Chieh-Yang Huang, Tsung-Che Li, Ho Yin Sam Ng, Hen-Hsen Huang, Ting-Hao 'Kenneth' Huang

TL;DR
This paper proposes a cost-effective method using online reviews to evaluate how large language models perform across different language varieties, revealing consistent underperformance in Taiwan Mandarin.
Contribution
It introduces a novel dataset constructed from online reviews to benchmark LLM performance across language varieties, demonstrating disparities in sentiment analysis tasks.
Findings
LLMs underperform in Taiwan Mandarin compared to Mainland Mandarin
Online reviews can effectively serve as data sources for language variety benchmarking
The approach is cost-effective and scalable for diverse language varieties
Abstract
A language can have different varieties. These varieties can affect the performance of natural language processing (NLP) models, including large language models (LLMs), which are often trained on data from widely spoken varieties. This paper introduces a novel and cost-effective approach to benchmark model performance across language varieties. We argue that international online review platforms, such as Booking.com, can serve as effective data sources for constructing datasets that capture comments in different language varieties from similar real-world scenarios, like reviews for the same hotel with the same rating using the same language (e.g., Mandarin Chinese) but different language varieties (e.g., Taiwan Mandarin, Mainland Mandarin). To prove this concept, we constructed a contextually aligned dataset comprising reviews in Taiwan Mandarin and Mainland Mandarin and tested six LLMs…
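The contextual alignment described in the abstract (same hotel, same rating, different language variety) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the record fields (`hotel_id`, `rating`, `variety`, `text`), the variety codes `zh-TW` and `zh-CN`, the rating threshold used to derive a sentiment label, and the `classify` callable standing in for an LLM are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy records; field names and variety codes are assumptions.
SAMPLE_REVIEWS = [
    {"hotel_id": "A", "rating": 9, "variety": "zh-TW", "text": "房間很乾淨"},
    {"hotel_id": "A", "rating": 9, "variety": "zh-CN", "text": "房间很干净"},
    {"hotel_id": "A", "rating": 3, "variety": "zh-TW", "text": "冷氣壞了"},
    {"hotel_id": "A", "rating": 3, "variety": "zh-CN", "text": "空调坏了"},
    {"hotel_id": "B", "rating": 9, "variety": "zh-TW", "text": "很棒"},
]


def align_reviews(reviews, variety_a="zh-TW", variety_b="zh-CN"):
    """Pair reviews that share a hotel and a rating but differ in variety."""
    buckets = defaultdict(lambda: defaultdict(list))
    for r in reviews:
        buckets[(r["hotel_id"], r["rating"])][r["variety"]].append(r["text"])

    pairs = []
    for (hotel_id, rating), by_variety in buckets.items():
        # zip() keeps only as many pairs as the smaller side provides,
        # so unmatched reviews are dropped rather than reused.
        for a, b in zip(by_variety.get(variety_a, []), by_variety.get(variety_b, [])):
            pairs.append({"hotel_id": hotel_id, "rating": rating,
                          variety_a: a, variety_b: b})
    return pairs


def accuracy_by_variety(pairs, classify, varieties=("zh-TW", "zh-CN"), threshold=7):
    """Score a classifier per variety; gold label comes from the shared rating.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    if not pairs:
        return {v: 0.0 for v in varieties}
    correct = {v: 0 for v in varieties}
    for p in pairs:
        gold = "positive" if p["rating"] >= threshold else "negative"
        for v in varieties:
            if classify(p[v]) == gold:
                correct[v] += 1
    return {v: correct[v] / len(pairs) for v in varieties}
```

Because each pair shares its hotel and rating, a gap between the two accuracy values is less likely to come from differences in what the reviews describe and more likely to reflect the language variety itself. In practice, `classify` would wrap a call to the model under test.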
Taxonomy
Topics: Wikis in Education and Collaboration · Data Mining Algorithms and Applications · Advanced Text Analysis Techniques
