Exploring the Capability Boundaries of LLMs in Mastering of Chinese Chouxiang Language

Dianqing Lin; Tian Lan; Jiali Zhu; Jiang Li; Wei Chen; Xu Liu; Aruukhan; Xiangdong Su; Hongxu Hou; Guanglai Gao

arXiv:2604.15841·cs.CL·April 21, 2026

TL;DR

This paper evaluates large language models' performance on Chouxiang Language, a subcultural Chinese internet language, introducing a new benchmark and analyzing their strengths and limitations across six NLP tasks.

Contribution

It introduces Mouse, a specialized benchmark for Chouxiang Language, and provides insights into LLMs' capabilities and challenges in handling this unique internet language.

Findings

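01. SOTA LLMs show clear limitations on multiple Chouxiang Language tasks.

02. LLMs perform well on tasks that involve contextual semantic understanding.

03. The paper examines whether LLM-as-a-judge scoring of translations aligns with human judgments and analyzes the key factors that influence Chouxiang translation.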

Abstract

While large language models (LLMs) have achieved remarkable success in general language tasks, their performance on Chouxiang Language, a representative subcultural language in the Chinese internet context, remains largely unexplored. In this paper, we introduce Mouse, a specialized benchmark designed to evaluate the capabilities of LLMs on six NLP tasks involving Chouxiang Language. Experimental results show that current state-of-the-art (SOTA) LLMs exhibit clear limitations on multiple tasks, while performing well on tasks that involve contextual semantic understanding. We also discuss the reasons behind the generally low performance of SOTA LLMs on Chouxiang Language, examine whether the LLM-as-a-judge approach adopted for translation tasks aligns with human judgments and values, and analyze the key factors that influence Chouxiang translation. Our study…
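
The abstract asks whether LLM-as-a-judge scores for translation agree with human judgments but does not say here how that agreement is measured. One common way to check such alignment is a rank correlation between the two sets of scores. The sketch below is an illustration under that assumption, not the authors' procedure; the function names (`average_ranks`, `pearson`, `spearman`) are hypothetical helpers written for this example.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / (var_x * var_y) ** 0.5


def spearman(judge_scores, human_scores):
    """Spearman rank correlation: Pearson correlation computed on ranks."""
    if len(judge_scores) != len(human_scores) or len(judge_scores) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    return pearson(average_ranks(judge_scores), average_ranks(human_scores))


if __name__ == "__main__":
    # Made-up scores for five translations, for illustration only.
    judge = [4, 2, 5, 3, 1]
    human = [5, 2, 4, 3, 1]
    print(f"Spearman correlation: {spearman(judge, human):.3f}")
```

A value near 1 would indicate that the judge model ranks translations much as human raters do; a value near 0 would indicate little agreement. Rank correlation is used here because judge and human scales need not be calibrated to each other.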

Code & Models

Datasets

ollabalala/MOUSE
dataset · 25 downloads
