Consolidating Strategies for Countering Hate Speech Using Persuasive Dialogues

Sougata Saha; Rohini Srihari

arXiv:2401.07810·cs.CL·January 17, 2024·1 cites

TL;DR

This paper explores controllable strategies for generating persuasive counter-arguments to hate speech online, aiming for long-term solutions by engaging with perpetrators and reducing harmful rhetoric.

Contribution

It introduces novel methods for controlling counter-argument generation using argument structure, speech acts, and human traits, with models for automatic feature annotation.

Findings

01. Identified effective feature combinations for fluent, logical counter-arguments.

02. Developed models for automatic annotation of argumentative features.

03. Provided a silver-standard annotated hate speech corpus.

Abstract

Hateful comments are prevalent on social media platforms. Although tools for automatically detecting, flagging, and blocking such false, offensive, and harmful content online have matured lately, these reactive, brute-force methods alone provide only short-term and superficial remedies while the perpetrators persist. With the public availability of large language models, which can generate articulate, engaging synthetic content at scale, there are concerns about the rapid growth in the dissemination of such malicious content on the web. There is now a need to focus on deeper, long-term solutions that involve engaging with the human perpetrator behind the content to change their viewpoint, or at least tone down the rhetoric, using persuasive means. To that end, we propose defining and experimenting with controllable strategies for generating counter-arguments to hateful comments…

Taxonomy

Topics: Hate Speech and Cyberbullying Detection
