ChiEngMixBench: Evaluating Large Language Models on Spontaneous and Natural Chinese-English Code-Mixed Generation

Qingyan Yang; Tongxi Wang; Yunsheng Luo

arXiv:2601.16217 · cs.CL · January 26, 2026

TL;DR

This paper introduces ChiEngMixBench, a novel benchmark for evaluating the ability of large language models to generate authentic Chinese-English code-mixed language, emphasizing spontaneity and naturalness in real-world contexts.

Contribution

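The paper presents what the authors describe as the first benchmark for evaluating code-mixing ability in authentic community contexts, built on a general construction pipeline that scales across domains and bilingual pairs. It also reports an emergent Terminology Layering Strategy in model outputs that aligns with linguistic theory.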

Findings

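01. The proposed Spontaneity and Naturalness metrics systematically distinguish code-mixing performance across models.

02. Models exhibit an implicitly emergent Terminology Layering Strategy.

03. The construction pipeline supports scalable dataset development, and therefore systematic evaluation, across domains and bilingual pairs.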

Abstract

Code-mixing is increasingly prevalent in interactions between humans and large language models, yet existing work often reduces it to a translation or convertibility problem, making it difficult to assess whether a model's switching behavior is context-appropriate and aligned with human conventions. We introduce ChiEngMixBench, the first benchmark designed to evaluate code-mixing ability in authentic community contexts, built upon a general construction pipeline that enables scalable dataset development across domains and bilingual pairs. ChiEngMixBench formulates code-mixing as a cognitive alignment problem, characterized by two complementary signals: Spontaneity and Naturalness. Empirical evaluation shows that our metrics can systematically distinguish code-mixing performance across models. Beyond benchmarking, we further uncover an implicitly emergent Terminology Layering Strategy, a…

Taxonomy

Topics: Natural Language Processing Techniques · Multilingual Education and Policy · Text Readability and Simplification