Benchmarking Machine Translation on Chinese Social Media Texts
Kaiyan Zhao, Zheyong Xie, Zhongtao Miao, Xinze Lyu, Yao Hu, Shaosheng Cao

TL;DR
This paper introduces CSM-MTBench, a new benchmark for evaluating machine translation of Chinese social media text. It addresses the challenges of slang, neologisms, and stylistic fidelity with specialized subsets and evaluation methods.
Contribution
It presents a comprehensive benchmark with tailored evaluation approaches for social media Chinese translation, focusing on slang, style, and informal expressions, filling a critical gap in MT assessment.
Findings
Significant variation in model performance on social media-specific content
Traditional metrics often fail to capture stylistic fidelity
New benchmark enables better assessment of real-world Chinese social media translation
Abstract
The prevalence of rapidly evolving slang, neologisms, and highly stylized expressions in informal user-generated text, particularly on Chinese social media, poses significant challenges for Machine Translation (MT) benchmarking. Specifically, we identify two primary obstacles: (1) data scarcity, as high-quality parallel data requires bilingual annotators familiar with platform-specific slang and stylistic cues in both languages; and (2) metric limitations, where traditional evaluators like COMET often fail to capture stylistic fidelity and nonstandard expressions. To bridge these gaps, we introduce CSM-MTBench, a benchmark covering five Chinese-foreign language directions and consisting of two expert-curated subsets: Fun Posts, featuring context-rich, slang- and neologism-heavy content, and Social Snippets, emphasizing concise, emotion- and style-driven expressions. Furthermore, we…
Taxonomy
Topics: Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining · Hate Speech and Cyberbullying Detection
