Hate Speech Detection with Generalizable Target-aware Fairness
Tong Chen, Danny Wang, Xurong Liang, Marten Risius, Gianluca Demartini, Hongzhi Yin

TL;DR
This paper introduces GetFair, a novel method for hate speech detection that maintains fairness across known and unseen targeted groups by adversarially removing target-related biases using a hypernetwork-based filtering approach.
Contribution
GetFair is the first approach to generalize target-aware fairness in hate speech detection to unseen groups using a hypernetwork to generate adaptive filters.
Findings
GetFair outperforms existing methods on out-of-sample targets.
The hypernetwork effectively generates target-specific filters.
GetFair maintains fairness and accuracy across diverse and unseen groups.
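The idea of a hypernetwork that generates target-specific filters can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the names `FilterHyperNetwork`, `generate_filter`, and `apply_filter`, the single linear projection, the toy dimensions, and the random weights are all hypothetical and are not taken from the paper. The adversarial training that removes target-related bias is not shown. The sketch only shows how one shared network can emit a different filter for each target embedding, including an embedding for a target that was never seen during training.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class FilterHyperNetwork:
    """Hypothetical hypernetwork: maps a target embedding to the
    parameters (weight matrix and bias) of a linear filter."""

    def __init__(self, target_dim, repr_dim, seed=0):
        rng = random.Random(seed)
        self.repr_dim = repr_dim
        # One output per filter parameter: repr_dim x repr_dim weights plus repr_dim biases.
        n_params = repr_dim * repr_dim + repr_dim
        # Randomly initialised projection; in practice this would be learned.
        self.proj = [
            [rng.gauss(0.0, 0.1) for _ in range(target_dim)]
            for _ in range(n_params)
        ]

    def generate_filter(self, target_embedding):
        """Return (weight, bias) of the filter for the given target."""
        flat = matvec(self.proj, target_embedding)
        d = self.repr_dim
        weight = [flat[i * d:(i + 1) * d] for i in range(d)]
        bias = flat[d * d:]
        return weight, bias


def apply_filter(weight, bias, post_repr):
    """Apply the generated linear filter to a post representation."""
    return [h + b for h, b in zip(matvec(weight, post_repr), bias)]


# Toy usage: the same hypernetwork yields different filters for
# a "seen" target embedding and an "unseen" one (both made up).
hyper = FilterHyperNetwork(target_dim=4, repr_dim=3)
seen_target = [1.0, 0.0, 0.0, 0.0]
unseen_target = [0.2, 0.7, 0.1, 0.0]
post = [0.5, -1.0, 2.0]

filtered_seen = apply_filter(*hyper.generate_filter(seen_target), post)
filtered_unseen = apply_filter(*hyper.generate_filter(unseen_target), post)
```

Because the filter parameters are a function of the target embedding rather than a per-target lookup table, any new embedding produces a filter without adding parameters. That property is what allows this kind of design to extend to unseen targets.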
Abstract
To counter the side effect brought by the proliferation of social media platforms, hate speech detection (HSD) plays a vital role in halting the dissemination of toxic online posts at an early stage. However, given the ubiquitous topical communities on social media, a trained HSD classifier easily becomes biased towards specific targeted groups (e.g., female and black people), where a high rate of false positive/negative results can significantly impair public trust in the fairness of content moderation mechanisms, and eventually harm the diversity of online society. Although existing fairness-aware HSD methods can smooth out some discrepancies across targeted groups, they are mostly specific to a narrow selection of targets that are assumed to be known and fixed. This inevitably prevents those methods from generalizing to real-world use cases where new targeted groups constantly emerge…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
Methods: HyperNetwork
