"Newspaper Eat" Means "Not Tasty": A Taxonomy and Benchmark for Coded Language in Real-World Chinese Online Reviews

Ruyuan Wan; Changye Li; Ting-Hao 'Kenneth' Huang

arXiv:2601.19932 · cs.CL · April 23, 2026

TL;DR

This paper presents CodedLang, a Chinese review dataset with annotations and a taxonomy for coded language, highlighting challenges for NLP models in decoding such language.

Contribution

Introduces a new dataset and taxonomy for coded language in Chinese reviews, and benchmarks models' ability to detect and interpret coded expressions.

Findings

01. Models struggle to identify coded language accurately.

02. Phonetic strategies are common in coded expressions.

03. Coded language poses significant challenges for NLP systems.

Abstract

Coded language is an important part of human communication. It refers to cases where users intentionally encode meaning so that the surface text differs from the intended meaning and must be decoded to be understood. Current language models handle coded language poorly. Progress has been limited by the lack of real-world datasets and clear taxonomies. This paper introduces CodedLang, a dataset of 7,744 Chinese Google Maps reviews, including 900 reviews with span-level annotations of coded language. We developed a seven-class taxonomy that captures common encoding strategies, including phonetic, orthographic, and cross-lingual substitutions. We benchmarked language models on coded language detection, classification, and review rating prediction. Results show that even strong models can fail to identify or understand coded language. Because many coded expressions rely on…
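To make "span-level annotations" and "coded language detection" concrete, here is a minimal sketch of how one annotated review could be represented and how predicted spans could be scored against gold spans. All of it is an assumption for illustration: the class and field names, the exact-match scoring rule, and the toy English review are hypothetical and are not taken from the CodedLang release. The only strategy labels used are ones the abstract names (phonetic, orthographic). Labeling the title's example as phonetic is also a guess.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CodedSpan:
    """One coded expression inside a review (hypothetical schema)."""
    start: int      # inclusive character offset into the review text
    end: int        # exclusive character offset
    strategy: str   # taxonomy label, e.g. "phonetic" (label set assumed)
    decoded: str    # intended meaning after decoding


@dataclass
class AnnotatedReview:
    """A review with its star rating and coded-language spans (hypothetical)."""
    text: str
    rating: int
    spans: list = field(default_factory=list)


def span_prf(gold, pred):
    """Exact-match span precision, recall, and F1.

    A predicted span counts as correct only if both offsets match a gold
    span. This scoring rule is an assumption, not the paper's metric.
    """
    gold_set = {(s.start, s.end) for s in gold}
    pred_set = {(s.start, s.end) for s in pred}
    tp = len(gold_set & pred_set)
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example built from the paper title's gloss; not a dataset record.
text = "The noodles were newspaper eat."
start = text.index("newspaper eat")
end = start + len("newspaper eat")
review = AnnotatedReview(
    text=text,
    rating=1,
    spans=[CodedSpan(start, end, "phonetic", "not tasty")],
)

# A hypothetical model output: one correct span and one spurious span.
predicted = [
    CodedSpan(start, end, "phonetic", "not tasty"),
    CodedSpan(0, 3, "orthographic", ""),
]
precision, recall, f1 = span_prf(review.spans, predicted)
```

Exact-match scoring is the strictest choice. A partial-overlap criterion would be more forgiving of boundary errors, and which one suits a given benchmark depends on how the annotation guidelines define span boundaries.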
