How Well Do LLMs Handle Cantonese? Benchmarking Cantonese Capabilities of Large Language Models

Jiyue Jiang; Pengan Chen; Liheng Chen; Sheng Wang; Qinghang Bao; Lingpeng Kong; Yu Li; Chuan Wu

arXiv:2408.16756 · cs.CL · February 18, 2025

TL;DR

This paper benchmarks the performance of large language models in handling Cantonese, highlighting existing gaps and proposing new evaluation benchmarks to improve Cantonese NLP capabilities.

Contribution

It introduces new benchmarks for evaluating Cantonese LLMs across multiple tasks and discusses strategies to advance open-source Cantonese NLP models.

Findings

01

Cantonese is underrepresented in NLP research.

02

New benchmarks evaluate factual, logical, and reasoning skills in Cantonese.

03

The paper offers recommendations for advancing Cantonese LLM development.

Abstract

The rapid evolution of large language models (LLMs) has transformed the competitive landscape in natural language processing (NLP), particularly for English and other data-rich languages. However, underrepresented languages like Cantonese, spoken by over 85 million people, face significant development gaps. This is particularly concerning given the economic significance of the Guangdong-Hong Kong-Macau Greater Bay Area and the substantial Cantonese-speaking populations in places such as Singapore and North America. Despite its wide use, Cantonese has scant representation in NLP research, especially compared to other languages from similarly developed regions. To bridge these gaps, we outline current Cantonese NLP methods and introduce new benchmarks designed to evaluate LLM performance in factual generation, mathematical logic, complex reasoning, and general knowledge in Cantonese, which…

Code & Models

Repositories

jiangjyjy/yue-benchmark
PyTorch · Official

Datasets

BillBao/Yue-Benchmark
dataset · 22 dl

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification