Lost in Pronunciation: Detecting Chinese Offensive Language Disguised by Phonetic Cloaking Replacement

Haotan Guo; Jianfei He; Jiayuan Ma; Hongbin Na; Zimu Wang; Haiyang Zhang; Qi Chen; Wei Wang; Zijing Shi; Tao Shen; Ling Chen

arXiv:2507.07640·cs.CL·July 11, 2025

TL;DR

This paper introduces a taxonomy and dataset for Chinese phonetic cloaking replacement (PCR), revealing current detection methods' limitations and proposing a Pinyin-based prompting strategy to improve robustness in toxicity detection.

Contribution

The paper provides the first comprehensive taxonomy of Chinese PCR (a four-way surface-form scheme), a realistic benchmark of 500 naturally occurring cloaked offensive posts from RedNote, and a lightweight Pinyin-based prompting technique that improves detection accuracy.

Findings

01 The best state-of-the-art LLM reaches an F1-score of only 0.672 on PCR detection.

02 Zero-shot chain-of-thought prompting lowers detection performance further.

03 Pinyin-based prompting recovers much of the lost detection accuracy.
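
For readers unfamiliar with the metric in the first finding, the following is a minimal sketch of how an F1-score is computed from binary labels. It assumes binary F1 on the offensive class; this page does not say whether the paper reports binary or macro-averaged F1, and the labels below are made-up toy values, not results from the dataset.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1 for the `positive` class (assumed here to mean 'offensive')."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy gold labels and predictions (1 = offensive, 0 = non-offensive).
    gold = [1, 1, 1, 0, 0]
    pred = [1, 1, 0, 1, 0]
    print(round(f1_score(gold, pred), 3))
```

Because F1 balances precision and recall, a detector that misses many cloaked posts scores low even if the posts it does flag are nearly always correct.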

Abstract

Phonetic Cloaking Replacement (PCR), defined as the deliberate use of homophonic or near-homophonic variants to hide toxic intent, has become a major obstacle to Chinese content moderation. While this problem is well-recognized, existing evaluations predominantly rely on rule-based, synthetic perturbations that ignore the creativity of real users. We organize PCR into a four-way surface-form taxonomy and compile PCR-ToxiCN, a dataset of 500 naturally occurring, phonetically cloaked offensive posts gathered from the RedNote platform. Benchmarking state-of-the-art LLMs on this dataset exposes a serious weakness: the best model reaches only an F1-score of 0.672, and zero-shot chain-of-thought prompting pushes performance even lower. Guided by error analysis, we revisit a Pinyin-based prompting strategy that earlier studies judged ineffective and show that it recovers much of the lost accuracy.…
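
The general idea behind Pinyin-based prompting can be sketched as follows: the post is transliterated to Pinyin and both forms are shown to the model, so that homophone substitutions become visible at the sound level. This is an illustrative sketch only, not the authors' prompt. The lookup table `TOY_PINYIN`, the helper names `to_pinyin` and `build_prompt`, and the prompt wording are all hypothetical, and the example uses a benign homophone pair (杯具 "cup-ware" for 悲剧 "tragedy") rather than an item from the dataset.

```python
# Hypothetical toy table of toneless Pinyin readings. A real pipeline would
# use a full transliteration library instead of a hand-written dictionary.
TOY_PINYIN = {
    "真": "zhen", "是": "shi",
    "杯": "bei", "具": "ju",  # "杯具" (cup-ware) ...
    "悲": "bei", "剧": "ju",  # ... sounds the same as "悲剧" (tragedy)
}


def to_pinyin(text: str) -> str:
    """Transliterate character by character; unknown characters pass through."""
    return " ".join(TOY_PINYIN.get(ch, ch) for ch in text)


def build_prompt(post: str) -> str:
    """Build a prompt that shows the post together with its Pinyin reading."""
    return (
        "Decide whether the following Chinese post is offensive.\n"
        "Some words may be replaced by characters that sound alike, "
        "so consider the Pinyin as well as the characters.\n"
        f"Post: {post}\n"
        f"Pinyin: {to_pinyin(post)}\n"
        "Answer with 'offensive' or 'non-offensive'."
    )


if __name__ == "__main__":
    # The cloaked and original spellings share one Pinyin reading.
    print(to_pinyin("杯具") == to_pinyin("悲剧"))
    print(build_prompt("真是杯具"))
```

The resulting string would then be sent to whichever LLM is being evaluated. The table is toneless, so near-homophones that differ only in tone collapse to the same reading; whether the paper's prompts keep tone marks is not stated on this page.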

Code & Models

Datasets

UTSNLPGroup/PCR-ToxiCN
dataset · 18 downloads

Videos

Lost in Pronunciation: Detecting Chinese Offensive Language Disguised by Phonetic Cloaking Replacement

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Spam and Phishing Detection · Authorship Attribution and Profiling