Enabling Contextual Soft Moderation on Social Media through Contrastive Textual Deviation

Pujan Paudel; Mohammad Hammas Saeed; Rebecca Auger; Chris Wells; Gianluca Stringhini

arXiv:2407.20910·cs.CL·July 31, 2024

Open Access

TL;DR

This paper introduces Contrastive Textual Deviation (CTD), a novel stance detection method that improves the accuracy of soft moderation systems on social media by significantly reducing false positives.

Contribution

The paper develops CTD, a new textual deviation task, and demonstrates its integration into existing moderation systems to enhance their precision and reliability.

Findings

01

CTD outperforms existing stance detection methods.

02

Integrating CTD into an existing soft moderation system reduces contextual false positives from 20% to 2.1%.

03

Improves trustworthiness of automated moderation tools.

Abstract

Automated soft moderation systems are unable to ascertain if a post supports or refutes a false claim, resulting in a large number of contextual false positives. This limits their effectiveness, for example undermining trust in health experts by adding warnings to their posts or resorting to vague warnings instead of granular fact-checks, which result in desensitizing users. In this paper, we propose to incorporate stance detection into existing automated soft-moderation pipelines, with the goal of ruling out contextual false positives and providing more precise recommendations for social media content that should receive warnings. We develop a textual deviation task called Contrastive Textual Deviation (CTD) and show that it outperforms existing stance detection approaches when applied to soft moderation. We then integrate CTD into the state-of-the-art system for automated soft…
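The idea of ruling out contextual false positives with a stance check can be illustrated with a minimal sketch. It is not the paper's CTD model: the `Candidate` type, the `toy_stance` keyword heuristic, and the `select_for_warning` helper are all hypothetical names introduced here for illustration. The sketch assumes an upstream system has already matched posts to a known false claim, and shows only the gating step, where posts that refute the claim are excluded from receiving a warning.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Candidate:
    """A post that an upstream matcher has linked to a known false claim."""
    post: str
    claim: str


def toy_stance(post: str, claim: str) -> str:
    """Hypothetical stand-in for a stance model (NOT the paper's CTD).

    Labels a post as "refutes" if it contains an obvious refutation cue,
    otherwise "supports". A real system would use a trained classifier.
    """
    refute_cues = ("false", "debunked", "myth", "not true")
    text = post.lower()
    return "refutes" if any(cue in text for cue in refute_cues) else "supports"


def select_for_warning(
    candidates: List[Candidate],
    stance_fn: Callable[[str, str], str] = toy_stance,
) -> List[Candidate]:
    """Keep only candidates whose post supports the false claim."""
    return [c for c in candidates if stance_fn(c.post, c.claim) == "supports"]


claim = "vaccines cause autism"
candidates = [
    Candidate("Vaccines cause autism, wake up people", claim),
    Candidate("The idea that vaccines cause autism is a debunked myth", claim),
]

flagged = select_for_warning(candidates)
for c in flagged:
    print("WARN:", c.post)
```

In this toy run only the first post is selected for a warning; the second, which discusses the same claim in order to refute it, is filtered out. That second post is the kind of contextual false positive the abstract describes.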

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Social Media and Politics