Exploring Multimodal Challenges in Toxic Chinese Detection: Taxonomy, Benchmark, and Findings
Shujian Yang, Shiyao Cui, Chuanrui Hu, Haicheng Wang, Tianwei Zhang, Minlie Huang, Jialiang Lu, Han Qiu

TL;DR
This paper examines why large language models struggle to detect toxic Chinese content and identifies the multimodal nature of the Chinese language as a key challenge. It proposes a taxonomy of perturbation strategies, curates a dataset, benchmarks SOTA models, and explores enhancement methods, revealing limitations and risks of current approaches.
Contribution
The paper introduces a taxonomy of 3 perturbation strategies and 8 specific approaches, curates a dataset based on that taxonomy, benchmarks 9 SOTA LLMs, and evaluates cost-effective enhancement techniques (in-context learning and supervised fine-tuning) for toxic Chinese detection.
Findings
LLMs struggle with perturbed multimodal toxic Chinese content
In-context learning and supervised fine-tuning can cause overcorrection
Benchmark results show significant performance gaps in toxic content detection
Abstract
Detecting toxic content using language models is important but challenging. While large language models (LLMs) have demonstrated strong performance in understanding Chinese, recent studies show that simple character substitutions in toxic Chinese text can easily confuse state-of-the-art (SOTA) LLMs. In this paper, we highlight the multimodal nature of the Chinese language as a key challenge for deploying LLMs in toxic Chinese detection. First, we propose a taxonomy of 3 perturbation strategies and 8 specific approaches in toxic Chinese content. Then, we curate a dataset based on this taxonomy and benchmark 9 SOTA LLMs (from both the US and China) to assess whether they can detect perturbed toxic Chinese text. Additionally, we explore cost-effective enhancement solutions such as in-context learning (ICL) and supervised fine-tuning (SFT). Our results reveal two important findings. (1) LLMs are…
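The character-substitution problem mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's method or dataset: the substitution table, the blocklist, and the function names `perturb` and `keyword_flag` are hypothetical, and a mild insult stands in for toxic text. The sketch only shows how swapping one character for a similar-sounding one leaves the phrase readable to a human while defeating an exact-match check.

```python
# Hypothetical substitution table (assumption, not from the paper):
# each key is swapped for a similar-sounding character.
HOMOPHONE_MAP = {"笨": "苯"}  # both are read as "ben"

# Hypothetical blocklist; a mild insult ("fool") stands in for toxic terms.
BLOCKLIST = {"笨蛋"}


def perturb(text: str, table: dict[str, str]) -> str:
    """Replace every character that has an entry in the table."""
    return "".join(table.get(ch, ch) for ch in text)


def keyword_flag(text: str, blocklist: set[str]) -> bool:
    """Naive detector: flag the text if any blocklisted term appears verbatim."""
    return any(term in text for term in blocklist)


original = "你真是个笨蛋"
perturbed = perturb(original, HOMOPHONE_MAP)

print(perturbed)                           # the substituted sentence
print(keyword_flag(original, BLOCKLIST))   # the original matches the blocklist
print(keyword_flag(perturbed, BLOCKLIST))  # the perturbed text does not
```

An exact-match filter is only a stand-in here; the abstract reports that SOTA LLMs are also confused by such substitutions, which is what the benchmark is built to measure.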
