Benchmarking Machine Translation on Chinese Social Media Texts

Kaiyan Zhao; Zheyong Xie; Zhongtao Miao; Xinze Lyu; Yao Hu; Shaosheng Cao

arXiv:2601.22931·cs.CL·February 2, 2026

Open Access

TL;DR

This paper introduces CSM-MTBench, a new benchmark for evaluating Chinese social media machine translation, addressing challenges of slang, neologisms, and stylistic fidelity with specialized subsets and evaluation methods.

Contribution

It presents a comprehensive benchmark with tailored evaluation approaches for social media Chinese translation, focusing on slang, style, and informal expressions, filling a critical gap in MT assessment.

Findings

01. Significant variation in model performance on social media-specific content

02. Traditional metrics often fail to capture stylistic fidelity

03. New benchmark enables better assessment of real-world Chinese social media translation

Abstract

The prevalence of rapidly evolving slang, neologisms, and highly stylized expressions in informal user-generated text, particularly on Chinese social media, poses significant challenges for Machine Translation (MT) benchmarking. Specifically, we identify two primary obstacles: (1) data scarcity, as high-quality parallel data requires bilingual annotators familiar with platform-specific slang and stylistic cues in both languages; and (2) metric limitations, where traditional evaluators like COMET often fail to capture stylistic fidelity and nonstandard expressions. To bridge these gaps, we introduce CSM-MTBench, a benchmark covering five Chinese-foreign language directions and consisting of two expert-curated subsets: Fun Posts, featuring context-rich, slang- and neologism-heavy content, and Social Snippets, emphasizing concise, emotion- and style-driven expressions. Furthermore, we…

Taxonomy

Topics: Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining · Hate Speech and Cyberbullying Detection