Facilitating Fine-grained Detection of Chinese Toxic Language: Hierarchical Taxonomy, Resources, and Benchmarks
Junyu Lu, Bo Xu, Xiaokun Zhang, Changrong Min, Liang Yang, Hongfei Lin

TL;DR
This paper introduces a hierarchical taxonomy, a new fine-grained dataset, and a lexical knowledge-enhanced benchmark to improve Chinese toxic language detection, addressing previous limitations in annotation detail and indirect toxicity detection.
Contribution
It presents a hierarchical taxonomy, a comprehensive dataset with direct and indirect toxicity, and a lexical knowledge-based benchmark for more effective Chinese toxic language detection.
Findings
TKE outperforms baseline models in detection accuracy
Lexical features significantly improve toxicity classification
Fine-grained annotations enable better understanding of toxic expressions
Abstract
The widespread dissemination of toxic online posts is increasingly damaging to society. However, research on detecting toxic language in Chinese has lagged significantly. Existing datasets lack fine-grained annotation of toxic types and expressions, and ignore the samples with indirect toxicity. In addition, it is crucial to introduce lexical knowledge to detect the toxicity of posts, which has been a challenge for researchers. In this paper, we facilitate the fine-grained detection of Chinese toxic language. First, we built Monitor Toxic Frame, a hierarchical taxonomy to analyze toxic types and expressions. Then, a fine-grained dataset ToxiCN is presented, including both direct and indirect toxic samples. We also build an insult lexicon containing implicit profanity and propose Toxic Knowledge Enhancement (TKE) as a benchmark, incorporating the lexical feature to detect toxic language.…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsHate Speech and Cyberbullying Detection · Natural Language Processing Techniques
