An Effective, Robust and Fairness-aware Hate Speech Detection Framework
Guanyi Mou, Kyumin Lee

TL;DR
This paper presents a hate speech detection framework designed to be accurate, robust against attacks, and fair. It combines data augmentation, uncertainty estimation, and a new neural network architecture, and it outperforms eight state-of-the-art methods.
Contribution
The paper introduces a data-augmented, fairness-aware, and uncertainty-estimated framework with Bidirectional Quaternion-Quasi-LSTM layers for improved hate speech detection.
Findings
Outperforms eight state-of-the-art methods
Effective under both attack and no-attack scenarios
Generalizes well across multiple datasets
Abstract
With the widespread use of online social networks, hate speech is spreading faster and causing more damage than ever before. Existing hate speech detection methods have limitations in several aspects, such as handling data insufficiency, estimating model uncertainty, improving robustness against malicious attacks, and handling unintended bias (i.e., fairness). There is an urgent need for accurate, robust, and fair hate speech classification in online social networks. To bridge the gap, we design a novel data-augmented, fairness-addressed, and uncertainty-estimated framework. As part of the framework, we propose Bidirectional Quaternion-Quasi-LSTM layers to balance effectiveness and efficiency. To build a generalized model, we combine five datasets collected from three platforms. Experimental results show that our model outperforms eight state-of-the-art methods under both no attack scenario…
