How do you feel? Measuring User-Perceived Value for Rejecting Machine Decisions in Hate Speech Detection
Philippe Lammerts, Philip Lippmann, Yen-Chia Hsu, Fabio Casati, and Jie Yang

TL;DR
This paper introduces a value-sensitive rejection mechanism for hate speech detection systems that considers user perceptions to improve human-AI collaboration, using Magnitude Estimation to measure user agreement and guide decision-making.
Contribution
It proposes a novel rejection mechanism based on user-perceived value and introduces Magnitude Estimation as a measurement tool for user agreement in hate speech detection.
Findings
Magnitude Estimation reliably measures user perception.
User-perceived value guides optimal rejection of machine decisions.
Model selection based on user perception outperforms accuracy-based selection.
Abstract
Hate speech moderation remains a challenging task for social media platforms. Human-AI collaborative systems offer the potential to combine the strengths of humans' reliability and the scalability of machine learning to tackle this issue effectively. While methods for task handover in human-AI collaboration exist that consider the costs of incorrect predictions, insufficient attention has been paid to accurately estimating these costs. In this work, we propose a value-sensitive rejection mechanism that automatically rejects machine decisions for human moderation based on users' value perceptions regarding machine decisions. We conduct a crowdsourced survey study with 160 participants to evaluate their perception of correct and incorrect machine decisions in the domain of hate speech detection, as well as occurrences where the system rejects making a prediction. Here, we introduce…
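The general idea of value-sensitive rejection can be illustrated with a minimal sketch. This is not the authors' implementation: the `Values` fields, the function names, and any numbers passed in are hypothetical placeholders, whereas the paper derives its values from users' survey responses. The sketch assumes each machine decision comes with a confidence score, that decisions below a threshold are handed to a human moderator, and that every outcome (correct, incorrect, or rejected) carries a scalar value. The threshold is then chosen to maximize the total value over a labeled set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Values:
    """Hypothetical per-outcome values (placeholders, not taken from the paper)."""
    tp: float        # hateful content correctly flagged
    tn: float        # non-hateful content correctly passed
    fp: float        # non-hateful content wrongly flagged
    fn: float        # hateful content wrongly passed
    rejected: float  # decision handed over to a human moderator


def total_value(confidences, predictions, labels, threshold, values):
    """Sum the value of all decisions when those below `threshold` are rejected."""
    total = 0.0
    for conf, pred, label in zip(confidences, predictions, labels):
        if conf < threshold:
            total += values.rejected
        elif pred == 1 and label == 1:
            total += values.tp
        elif pred == 0 and label == 0:
            total += values.tn
        elif pred == 1 and label == 0:
            total += values.fp
        else:
            total += values.fn
    return total


def best_threshold(confidences, predictions, labels, values, candidates):
    """Return the candidate threshold with the highest total value."""
    return max(
        candidates,
        key=lambda t: total_value(confidences, predictions, labels, t, values),
    )
```

Under this framing, two models can be compared by the total value each achieves at its own best threshold rather than by accuracy alone, which is the kind of comparison the Findings refer to.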
