Predictively Combatting Toxicity in Health-related Online Discussions through Machine Learning

Jorge Paz-Ruza; Amparo Alonso-Betanzos; Bertha Guijarro-Berdiñas; Carlos Eiras-Franco

arXiv:2505.17068 · cs.CL · May 26, 2025

TL;DR

This paper introduces a machine learning approach that predicts potential toxicity in health-related online discussions, enabling proactive moderation to prevent conflicts before they occur.

Contribution

It presents a Collaborative Filtering-based method that predicts toxicity between users and subcommunities in COVID-related Reddit discussions, surpassing 80% predictive performance in relevant metrics and enabling preemptive moderation.

Findings

01. Achieved over 80% predictive performance in relevant metrics for toxicity prediction

02. Allowed the pairing of conflicting users and subcommunities to be prevented before toxic interactions occur

03. Demonstrated on COVID-related Reddit conversations

Abstract

In health-related topics, user toxicity in online discussions frequently becomes a source of social conflict or promotion of dangerous, unscientific behaviour. Common approaches for battling it include different forms of detection, flagging and/or removal of existing toxic comments, which are often counterproductive for platforms and users alike. In this work, we propose the alternative of combatting user toxicity predictively, anticipating where a user could interact toxically in health-related online discussions. Applying a Collaborative Filtering-based Machine Learning methodology, we predict the toxicity in COVID-related conversations between any user and subcommunity of Reddit, surpassing 80% predictive performance in relevant metrics, and allowing us to prevent the pairing of conflicting users and subcommunities.
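To make the idea of predicting toxicity for a (user, subcommunity) pair more concrete, the following is a minimal sketch of one generic Collaborative Filtering technique, logistic matrix factorization. It is not the authors' model: the abstract does not specify the architecture, features, or labels, so the toy data, the function names `train_cf` and `predict_toxicity`, and all hyperparameters are assumptions made purely for illustration.

```python
import numpy as np

def train_cf(interactions, n_users, n_subs, dim=8, lr=0.1, epochs=500, reg=0.01, seed=0):
    """Fit user and subcommunity embeddings with SGD on a logistic loss.

    interactions: list of (user_id, sub_id, label), label 1 = toxic, 0 = non-toxic.
    All hyperparameters here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(n_users, dim))  # user embeddings
    S = rng.normal(scale=0.1, size=(n_subs, dim))   # subcommunity embeddings
    for _ in range(epochs):
        for u, s, y in interactions:
            p = 1.0 / (1.0 + np.exp(-(U[u] @ S[s])))  # predicted toxicity probability
            err = p - y
            # Compute both gradients before updating either embedding.
            grad_u = err * S[s] + reg * U[u]
            grad_s = err * U[u] + reg * S[s]
            U[u] -= lr * grad_u
            S[s] -= lr * grad_s
    return U, S

def predict_toxicity(U, S, u, s):
    """Return the predicted probability that user u interacts toxically in subcommunity s."""
    return float(1.0 / (1.0 + np.exp(-(U[u] @ S[s]))))

# Hypothetical toy data: 4 users, 2 subcommunities.
interactions = [
    (0, 0, 1), (0, 1, 0),
    (1, 0, 1), (1, 1, 0),
    (2, 0, 0), (2, 1, 1),
    (3, 1, 1),
]
U, S = train_cf(interactions, n_users=4, n_subs=2)

# Score a pair that never appeared in the training data.
score = predict_toxicity(U, S, 3, 0)
print(f"predicted toxicity for user 3 in subcommunity 0: {score:.2f}")
```

In a setting like the one the abstract describes, a score of this kind could be compared against a threshold to decide whether to avoid pairing a user with a subcommunity; the threshold and how it is chosen are design decisions not covered by the source.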
