Safeguarding Decentralized Social Media: LLM Agents for Automating Community Rule Compliance
Lucio La Cava, Andrea Tagarelli

TL;DR
This paper evaluates AI agents powered by Large Language Models for automating content compliance in decentralized social media, demonstrating their effectiveness in detecting violations and supporting moderation.
Contribution
It introduces and assesses six LLM-based AI agents tailored for compliance checking across diverse decentralized social network communities.
Findings
AI agents effectively identify non-compliant content
Agents demonstrate high reliability and consistency
Human evaluations confirm usefulness for moderation
Abstract
Ensuring content compliance with community guidelines is crucial for maintaining healthy online social environments. However, traditional human-based compliance checking struggles with scaling due to the increasing volume of user-generated content and a limited number of moderators. Recent advancements in Natural Language Understanding demonstrated by Large Language Models unlock new opportunities for automated content compliance verification. This work evaluates six AI-agents built on Open-LLMs for automated rule compliance checking in Decentralized Social Networks, a challenging environment due to heterogeneous community scopes and rules. Analyzing over 50,000 posts from hundreds of Mastodon servers, we find that AI-agents effectively detect non-compliant content, grasp linguistic subtleties, and adapt to diverse community contexts. Most agents also show high inter-rater reliability…
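To make the notion of inter-rater reliability concrete, here is a minimal sketch of how agreement between two agents' compliance verdicts could be quantified with Cohen's kappa. The abstract does not say which reliability statistic the authors used, so the choice of kappa, the function name `cohen_kappa`, and the toy labels below are illustrative assumptions, not the paper's method or data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items.

    Hypothetical helper: the labels are assumed to be compliance verdicts
    (e.g. 1 = non-compliant, 0 = compliant) from two different agents.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)
    # Observed agreement: fraction of items on which both raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivial.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy verdicts from two hypothetical agents on eight posts.
agent_x = [1, 1, 0, 0, 1, 0, 1, 1]
agent_y = [1, 1, 0, 0, 1, 0, 0, 1]
kappa = cohen_kappa(agent_x, agent_y)
```

In this toy case the two agents agree on 7 of 8 posts, while chance agreement is 0.5, giving a kappa of 0.75. With more than two agents, a multi-rater statistic such as Fleiss' kappa or Krippendorff's alpha would be the usual alternative.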
Taxonomy
Topics: Privacy, Security, and Data Protection · Access Control and Trust · Digital Rights Management and Security
