Characterizing Bias: Benchmarking Large Language Models in Simplified versus Traditional Chinese

Hanjia Lyu; Jiebo Luo; Jian Kang; Allison Koenecke

arXiv:2505.22645·cs.CL·May 29, 2025

TL;DR

This paper benchmarks large language models on tasks involving Simplified and Traditional Chinese, revealing biases influenced by training data and tokenization, and provides an open dataset for future evaluations.

Contribution

It introduces two benchmark tasks to assess LLM biases between Chinese variants and offers an open dataset for reproducible bias analysis.

Findings

01. LLMs show task-dependent biases, favoring Simplified Chinese on some tasks and Traditional Chinese on others.

02. Biases are influenced by training data representation and tokenization differences.

03. An open-source benchmark dataset is provided for future research.
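To make the regional term choice task (naming an item that is referred to differently in Mainland China and Taiwan) more concrete, the following is a minimal sketch of how a model response might be labelled by the regional term it contains. It is an illustration only and not the authors' benchmark code: the function names `classify_term` and `tally`, the label strings, and the pineapple example (commonly 菠萝 in Mainland China and 鳳梨 in Taiwan) are all assumptions introduced here.

```python
from collections import Counter

# Hypothetical example item: "pineapple".
# Each regional term is listed in both scripts so that the script of the
# response (Simplified vs. Traditional) is not confused with the term choice.
MAINLAND_TERMS = {"菠萝", "菠蘿"}   # Mainland usage, Simplified and Traditional forms
TAIWAN_TERMS = {"凤梨", "鳳梨"}     # Taiwan usage, Simplified and Traditional forms


def classify_term(response: str) -> str:
    """Label a response by which regional term(s) it mentions."""
    has_mainland = any(term in response for term in MAINLAND_TERMS)
    has_taiwan = any(term in response for term in TAIWAN_TERMS)
    if has_mainland and has_taiwan:
        return "both"
    if has_mainland:
        return "mainland"
    if has_taiwan:
        return "taiwan"
    return "neither"


def tally(responses):
    """Count labels over a list of responses (assumed to be plain strings)."""
    return Counter(classify_term(r) for r in responses)
```

Comparing such tallies for the same question posed in Simplified and in Traditional Chinese is one simple way to look for the kind of task-dependent skew the findings describe; substring matching is a deliberate simplification and would miss paraphrases or alternative terms.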

Abstract

While the capabilities of Large Language Models (LLMs) have been studied in both Simplified and Traditional Chinese, it is yet unclear whether LLMs exhibit differential performance when prompted in these two variants of written Chinese. This understanding is critical, as disparities in the quality of LLM responses can perpetuate representational harms by ignoring the different cultural contexts underlying Simplified versus Traditional Chinese, and can exacerbate downstream harms in LLM-facilitated decision-making in domains such as education or hiring. To investigate potential LLM performance disparities, we design two benchmark tasks that reflect real-world scenarios: regional term choice (prompting the LLM to name a described item which is referred to differently in Mainland China and Taiwan), and regional name choice (prompting the LLM to choose who to hire from a list of names in…

Code & Models

Repositories

brucelyu17/sc-tc-bench (Official)

Taxonomy

Topics: Text Readability and Simplification · Artificial Intelligence in Healthcare and Education · Authorship Attribution and Profiling