ToXCL: A Unified Framework for Toxic Speech Detection and Explanation
Nhat M. Hoang, Xuan Long Do, Duc Anh Do, Duc Anh Vu, Luu Anh Tuan

TL;DR
ToXCL is a unified framework that effectively detects and explains implicit toxic speech by integrating targeted group identification, a boosted encoder-decoder model, and knowledge distillation, achieving state-of-the-art results.
Contribution
ToXCL introduces a unified model that both detects and explains implicit toxic speech. It combines a target group generator with an encoder-decoder model and knowledge distillation, and improves over prior approaches that treat the whole task as text generation.
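To make the knowledge distillation component concrete, here is a minimal sketch of a generic distillation objective for a binary toxic / non-toxic classifier. It is an illustration under stated assumptions, not the loss ToXCL itself uses: the function names, the `alpha` mixing weight, and the `temperature` value are all hypothetical, and the formulation is the standard soft-label one (cross-entropy on the gold label plus a KL term pulling the student towards a teacher's softened predictions).

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=2.0):
    """Generic distillation loss (hypothetical helper, not from the paper).

    Mixes hard-label cross-entropy with KL(teacher || student) computed on
    temperature-softened distributions. `alpha` and `temperature` are
    illustrative defaults, not values reported for ToXCL.
    """
    # Hard-label term: cross-entropy of the student against the gold label.
    student_probs = softmax(student_logits)
    cross_entropy = -math.log(student_probs[label])

    # Soft-label term: KL divergence from the teacher to the student.
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q)
             for p, q in zip(teacher_soft, student_soft) if p > 0)

    # temperature**2 keeps the soft term on a comparable gradient scale.
    return (1 - alpha) * cross_entropy + alpha * temperature ** 2 * kl


# Example: logits are [non-toxic, toxic]; the gold label is "toxic" (index 1).
loss = distillation_loss(student_logits=[0.2, 0.1],
                         teacher_logits=[-2.0, 2.5],
                         label=1)
```

When the student's logits match the teacher's, the KL term vanishes and only the weighted cross-entropy remains, so the soft term penalizes disagreement with the teacher only.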
Findings
Achieves state-of-the-art detection and explanation performance.
Outperforms baseline models significantly.
Effectively handles implicit toxic speech detection and explanation.
Abstract
The proliferation of online toxic speech is a pressing problem that poses threats to demographic groups. While explicit toxic speech contains offensive lexical signals, implicit toxic speech consists of coded or indirect language. It is therefore crucial for models not only to detect implicit toxic speech but also to explain its toxicity, which creates a need for unified frameworks that can do both effectively. Prior works mainly formulated toxic speech detection and explanation as a single text generation problem. However, models trained with this strategy are prone to error propagation. Moreover, our experiments reveal that the detection results of such models are much lower than those of models that focus only on the detection task. To bridge these gaps, we introduce ToXCL, a unified framework for the detection and…
Taxonomy
Topics: Natural Language Processing Techniques
MethodsFocus
