Collaborative Content Moderation in the Fediverse
Haris Bin Zia, Aravindh Raman, Ignacio Castro, Gareth Tyson

TL;DR
This paper introduces FedMod, a federated learning-based system for collaborative content moderation in the Fediverse. Servers exchange moderation model parameters with similar servers to improve the detection of harmful content and bots, as well as the assignment of content warnings.
Contribution
The paper presents FedMod, a novel federated learning approach tailored for decentralized content moderation, addressing resource constraints and privacy concerns in the Fediverse.
Findings
FedMod achieves macro-F1 scores of 0.71, 0.73, and 0.58 on harmful content detection, bot content detection, and content warning assignment, respectively.
The system demonstrates robust performance across different content moderation challenges.
Federated learning enables collaborative model improvement without centralized data collection.
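For readers unfamiliar with the metric reported above, macro-F1 is the unweighted mean of the per-class F1 scores, so rare classes count as much as common ones. The following is a minimal sketch of the standard definition, not code from the paper; the function name `macro_f1` and the toy labels are illustrative assumptions.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (standard definition)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example (hypothetical labels): 1 = harmful, 0 = benign.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
score = macro_f1(y_true, y_pred)
```

In the toy example, class 0 has F1 = 0.8 and class 1 has F1 = 2/3, so the macro average is about 0.73.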
Abstract
The Fediverse, a group of interconnected servers providing a variety of interoperable services (e.g. micro-blogging in Mastodon), has gained rapid popularity. This sudden growth, partly driven by Elon Musk's acquisition of Twitter, has created challenges for administrators. This paper focuses on one particular challenge: content moderation, e.g. the need to remove spam or hate speech. While centralized platforms like Facebook and Twitter rely on automated tools for moderation, their dependence on massive labeled datasets and specialized infrastructure renders them impractical for decentralized, low-resource settings like the Fediverse. In this work, we design and evaluate FedMod, a collaborative content moderation system based on federated learning. Our system enables servers to exchange parameters of partially trained local content moderation models with similar servers, creating…
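The parameter exchange described in the abstract can be illustrated with a small sketch. This is not FedMod's implementation: the cosine similarity over per-server profile vectors, the `threshold` value, the plain unweighted average, and all names (`cosine`, `aggregate_with_similar`, `params`, `profiles`) are assumptions made here for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def aggregate_with_similar(local_id, params, profiles, threshold=0.5):
    """Average the local parameters with those of sufficiently similar peers.

    params:   server id -> flat list of model parameters (hypothetical layout)
    profiles: server id -> vector describing the server (hypothetical)
    """
    peers = [
        sid for sid in params
        if sid != local_id
        and cosine(profiles[local_id], profiles[sid]) >= threshold
    ]
    group = [params[local_id]] + [params[sid] for sid in peers]
    # Element-wise mean over the local model and the selected peers.
    averaged = [sum(vals) / len(group) for vals in zip(*group)]
    return averaged, peers


# Hypothetical servers: "a" and "b" are similar, "c" is not.
params = {"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [100.0, 100.0]}
profiles = {"a": [1.0, 0.0], "b": [1.0, 0.1], "c": [0.0, 1.0]}
averaged, peers = aggregate_with_similar("a", params, profiles)
```

Only parameters cross server boundaries in this sketch; the training posts stay local, which is the property that makes federated learning attractive for independently run servers.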
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Digital Rights Management and Security · Advanced Malware Detection Techniques
