Think Like a Person Before Responding: A Multi-Faceted Evaluation of Persona-Guided LLMs for Countering Hate

Mikel K. Ngueajio; Flor Miriam Plaza-del-Arco; Yi-Ling Chung; Danda B. Rawat; Amanda Cercas Curry

arXiv:2506.04043·cs.CL·June 5, 2025

TL;DR

This paper evaluates how different prompting strategies influence the quality, tone, and ethical safety of counter-narratives generated by large language models to combat online hate speech.

Contribution

It introduces a comprehensive framework for assessing LLM-generated counter-narratives across multiple dimensions including persona, readability, tone, and ethics.

Findings

01. Emotionally guided prompts produce more empathetic responses.

02. LLM-generated counter-narratives tend to be verbose and college-level in readability.

03. Safety and effectiveness concerns remain despite improved tone.

Abstract

Automated counter-narratives (CN) offer a promising strategy for mitigating online hate speech, yet concerns about their affective tone, accessibility, and ethical risks remain. We propose a framework for evaluating Large Language Model (LLM)-generated CNs across four dimensions: persona framing, verbosity and readability, affective tone, and ethical robustness. Using GPT-4o-Mini, Cohere's CommandR-7B, and Meta's LLaMA 3.1-70B, we assess three prompting strategies on the MT-Conan and HatEval datasets. Our findings reveal that LLM-generated CNs are often verbose and adapted for people with college-level literacy, limiting their accessibility. While emotionally guided prompts yield more empathetic and readable responses, there remain concerns surrounding safety and effectiveness.
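
To make the verbosity and readability dimension concrete, here is a minimal sketch of how a counter-narrative's length and reading grade could be measured. The abstract does not say which readability metric the authors use, so the Flesch-Kincaid grade level below is an assumption, and the function names and the vowel-group syllable heuristic are illustrative rather than taken from the paper or its repository.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as the number of vowel groups (at least 1).

    This is a rough heuristic (an assumption for illustration); it
    over-counts silent 'e' endings such as in "safe".
    """
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def word_count(text: str) -> int:
    """Verbosity proxy: number of alphabetic word tokens."""
    return len(re.findall(r"[A-Za-z']+", text))


def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


if __name__ == "__main__":
    # Hypothetical counter-narratives, not drawn from either dataset.
    plain = "We all want to feel safe. Hate hurts us all."
    dense = ("Discriminatory generalizations systematically misrepresent "
             "heterogeneous communities and undermine constructive "
             "intercultural dialogue.")
    for cn in (plain, dense):
        print(word_count(cn), round(fk_grade(cn), 2))
```

Under this metric, a grade of roughly 13 or above corresponds to college-level text. Long sentences and polysyllabic vocabulary both push the score up, which is why a verbose response also tends to be a less accessible one.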

Code & Models

Repositories

mikelkn/woah-2025 (official)

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Mental Health via Writing · Persona Design and Applications

Methods: LLaMA