Privacy-Preserving Online Content Moderation: A Federated Learning Use Case
Pantelitsa Leonidou, Nicolas Kourtellis, Nikos Salamanos, Michael Sirivianos

TL;DR
This paper presents a federated learning framework with differential privacy for online content moderation, demonstrating high performance and low overhead in detecting harmful Twitter content while preserving user privacy.
Contribution
It introduces a privacy-preserving federated learning approach with differential privacy for content moderation, showing near-centralized performance and robustness with limited clients and data.
Findings
Achieves ~81% AUC even with a limited number of clients and data points.
Maintains high performance across multiple Twitter datasets.
Local training incurs minimal CPU and memory overhead.
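For readers unfamiliar with the AUC metric cited in the findings, the following is a minimal sketch of how it can be computed: AUC is the probability that a randomly chosen harmful example receives a higher score than a randomly chosen benign one. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 1 = harmful, 0 = benign.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.8]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a value near 0.81 means roughly four out of five harmful/benign pairs are ordered correctly.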
Abstract
Users are exposed daily to a large volume of harmful content on various social network platforms. One solution is developing online moderation tools using Machine Learning techniques. However, the processing of user data by online platforms requires compliance with privacy policies. Federated Learning (FL) is an ML paradigm where the training is performed locally on the users' devices. Although the FL framework complies, in theory, with the GDPR policies, privacy leaks can still occur. For instance, an attacker accessing the final trained model can successfully perform unwanted inference of the data belonging to the users who participated in the training process. In this paper, we propose a privacy-preserving FL framework for online content moderation that incorporates Differential Privacy (DP). To demonstrate the feasibility of our approach, we focus on detecting harmful content on…
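The combination of federated learning and differential privacy described in the abstract can be illustrated with a minimal sketch of one training round. It uses a common generic pattern: each client computes an update on its own data, the update is clipped to a maximum norm, and Gaussian noise is added to the aggregate. The excerpt does not say where or how the paper applies its noise, so this is not the authors' mechanism; the logistic-regression model, the function names (`clip`, `local_update`, `dp_fedavg_round`), and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def clip(update, max_norm):
    """Scale an update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in update]


def local_update(weights, data, lr=0.1):
    """One gradient step of logistic regression on a client's own data.

    Returns the weight delta; the raw data never leaves the client.
    """
    grad = [0.0] * len(weights)
    for x, y in data:
        z = sum(w * xi for w, xi in zip(weights, x))
        p = 1.0 / (1.0 + math.exp(-z))
        for i, xi in enumerate(x):
            grad[i] += (p - y) * xi
    return [-lr * g / len(data) for g in grad]


def dp_fedavg_round(weights, clients, max_norm=1.0, noise_multiplier=1.0, rng=random):
    """One round: clip each client delta, sum, add Gaussian noise, average."""
    clipped = [clip(local_update(weights, d), max_norm) for d in clients]
    sigma = noise_multiplier * max_norm  # noise scale tied to the clipping bound
    new_weights = []
    for i in range(len(weights)):
        total = sum(u[i] for u in clipped) + rng.gauss(0.0, sigma)
        new_weights.append(weights[i] + total / len(clients))
    return new_weights


# Hypothetical toy setup: two clients, each holding one (features, label) pair.
clients = [[([1.0, 0.0], 1)], [([0.0, 1.0], 0)]]
weights = dp_fedavg_round([0.0, 0.0], clients, noise_multiplier=0.0)
print(weights)
```

Clipping bounds how much any single client can shift the model, which is what lets the added noise mask an individual's contribution. Setting `noise_multiplier=0.0`, as in the usage lines, recovers plain federated averaging and makes the result deterministic; a real deployment would use a nonzero value and track the resulting privacy budget.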
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Internet Traffic Analysis and Secure E-voting · Adversarial Robustness in Machine Learning
