HateBuffer: Safeguarding Content Moderators' Mental Well-Being through Hate Speech Content Modification

Subin Park; Jeonghyun Kim; Jeanne Choi; Joseph Seering; Uichin Lee; Sung-Ju Lee

arXiv:2508.00439·cs.HC·August 4, 2025

Open Access

TL;DR

HateBuffer is a tool designed to protect content moderators from emotional harm by anonymizing and paraphrasing hate speech, showing promise as a supportive content moderation aid despite mixed measured effects.

Contribution

This paper introduces HateBuffer, a novel text modification system that anonymizes and paraphrases hate speech to support moderators' mental well-being.

Findings

01. Participants rated hate severity lower with HateBuffer.

02. HateBuffer did not significantly reduce emotional fatigue.

03. HateBuffer slightly improved moderation recall.

Abstract

Hate speech remains a persistent and unresolved challenge on online platforms. Content moderators, working on the front lines to review user-generated content and shield viewers from hate speech, often find themselves unprotected from the mental burden as they continuously engage with offensive language. To safeguard moderators' mental well-being, we designed HateBuffer, which anonymizes targets of hate speech, paraphrases offensive expressions into less offensive forms, and shows the original expressions when moderators opt to see them. Our user study with 80 participants consisted of a simulated hate speech moderation task set on a fictional news platform, followed by semi-structured interviews. Although participants rated the hate severity of comments lower while using HateBuffer, contrary to our expectations, they did not experience improved emotion or reduced fatigue compared with…
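
The three behaviors named in the abstract (anonymizing targets, paraphrasing offensive expressions, and revealing the original on request) can be illustrated with a minimal sketch. This is not the authors' implementation, which this summary does not describe. The lookup tables, the `[TARGET]` placeholder, and the names `BufferedComment` and `buffer_comment` are all hypothetical, and a real system would likely need something far more capable than word substitution.

```python
import re
from dataclasses import dataclass

# Hypothetical lookup tables, for illustration only. The paper's actual
# modification method is not described in this summary.
TARGET_TERMS = ["politicians", "teenagers"]
SOFTER_PARAPHRASES = {"idiots": "unwise people", "garbage": "of poor quality"}


@dataclass
class BufferedComment:
    original: str  # kept so a moderator can opt to see it
    modified: str  # shown by default

    def view(self, reveal_original: bool = False) -> str:
        """Return the modified text unless the moderator opts in."""
        return self.original if reveal_original else self.modified


def buffer_comment(text: str) -> BufferedComment:
    """Anonymize listed targets, then soften listed expressions."""
    modified = text
    # Step 1: replace each listed target with a neutral placeholder.
    for term in TARGET_TERMS:
        pattern = rf"\b{re.escape(term)}\b"
        modified = re.sub(pattern, "[TARGET]", modified, flags=re.IGNORECASE)
    # Step 2: replace each listed expression with a milder paraphrase.
    for harsh, softer in SOFTER_PARAPHRASES.items():
        pattern = rf"\b{re.escape(harsh)}\b"
        modified = re.sub(pattern, softer, modified, flags=re.IGNORECASE)
    return BufferedComment(original=text, modified=modified)


comment = buffer_comment("All politicians are idiots.")
print(comment.view())
print(comment.view(reveal_original=True))
```

Keeping the original string alongside the modified one is what makes the opt-in reveal possible without re-fetching the comment.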

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Sentiment Analysis and Opinion Mining · Misinformation and Its Impacts