An Effective, Robust and Fairness-aware Hate Speech Detection Framework

Guanyi Mou; Kyumin Lee

arXiv:2409.17191·cs.CL·September 27, 2024

TL;DR

This paper presents a hate speech detection framework designed to be accurate, robust against attacks, and fair. It combines data augmentation, uncertainty estimation, and a new neural network architecture, and it outperforms eight state-of-the-art methods.

Contribution

The paper introduces a data-augmented, fairness-aware framework with uncertainty estimation that includes Bidirectional Quaternion-Quasi-LSTM layers for improved hate speech detection.

Findings

01 Outperforms eight state-of-the-art methods

02 Effective in both attack and no-attack scenarios

03 Generalizes well across multiple datasets

Abstract

With the widespread use of online social networks, hate speech is spreading faster and causing more damage than ever before. Existing hate speech detection methods are limited in several respects: handling data insufficiency, estimating model uncertainty, improving robustness against malicious attacks, and handling unintended bias (i.e., fairness). There is an urgent need for accurate, robust, and fair hate speech classification in online social networks. To bridge this gap, we design a novel framework that is data-augmented, fairness-aware, and uncertainty-estimated. As part of the framework, we propose Bidirectional Quaternion-Quasi-LSTM layers to balance effectiveness and efficiency. To build a generalized model, we combine five datasets collected from three platforms. Experimental results show that our model outperforms eight state-of-the-art methods under both no attack scenario…
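The abstract does not spell out how the Bidirectional Quaternion-Quasi-LSTM layers are built, so the following is only a generic illustration, not the authors' layer. Quaternion-valued layers in general replace ordinary multiplication with the Hamilton product, which mixes four components using a single quaternion weight (four real numbers) where an unconstrained real-valued map between four components would need sixteen. That weight sharing is one commonly cited reason such layers can be parameter-efficient. The function name `hamilton_product` and the tuple layout `(r, i, j, k)` are assumptions made for this sketch.

```python
from typing import Tuple

# A quaternion r + i*x + j*y + k*z stored as (r, x, y, z).
# This layout is an assumption for illustration only.
Quaternion = Tuple[float, float, float, float]


def hamilton_product(p: Quaternion, q: Quaternion) -> Quaternion:
    """Return the Hamilton product p * q (non-commutative)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


if __name__ == "__main__":
    i = (0, 1, 0, 0)
    j = (0, 0, 1, 0)
    print(hamilton_product(i, j))  # i * j = k
    print(hamilton_product(j, i))  # j * i = -k
```

In a quaternion layer, a product of this kind would be applied between learned quaternion weights and groups of four input features; how the paper combines it with the quasi-recurrent and bidirectional parts is not stated in the text above.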
