Exploring the Capability Boundaries of LLMs in Mastering of Chinese Chouxiang Language
Dianqing Lin, Tian Lan, Jiali Zhu, Jiang Li, Wei Chen, Xu Liu, Aruukhan, Xiangdong Su, Hongxu Hou, Guanglai Gao

TL;DR
This paper evaluates large language models' performance on Chouxiang Language, a subcultural Chinese internet language, introducing a new benchmark and analyzing their strengths and limitations across six NLP tasks.
Contribution
It introduces Mouse, a specialized benchmark for Chouxiang Language, and provides insights into LLMs' capabilities and challenges in handling this unique internet language.
Findings
SOTA LLMs show limitations on multiple Chouxiang tasks
LLMs perform well on tasks requiring contextual semantic understanding
LLM-as-a-judge scoring of translations is checked against human judgments, and the key factors influencing Chouxiang translation are analyzed
Abstract
While large language models (LLMs) have achieved remarkable success in general language tasks, their performance on Chouxiang Language, a representative subcultural language in the Chinese internet context, remains largely unexplored. In this paper, we introduce Mouse, a specialized benchmark that evaluates the capabilities of LLMs on six NLP tasks involving Chouxiang Language. Experimental results show that current state-of-the-art (SOTA) LLMs exhibit clear limitations on multiple tasks, while performing well on tasks that involve contextual semantic understanding. We also discuss the reasons behind the generally low performance of SOTA LLMs on Chouxiang Language, examine whether the LLM-as-a-judge approach adopted for translation tasks aligns with human judgments and values, and analyze the key factors that influence Chouxiang translation. Our study…
