AI Feedback Enhances Community-Based Content Moderation through Engagement with Counterarguments
Saeedeh Mohammadi, Taha Yasseri

TL;DR
This study shows that AI-generated feedback, particularly argumentative feedback, improves the quality of community-based content moderation notes, highlighting the value of diverse perspectives and human-AI collaboration.
Contribution
The paper introduces a hybrid moderation framework in which AI-generated feedback helps contributors revise and improve their notes, targeting known challenges of community moderation such as partisan bias and delays in verification.
Findings
AI feedback improves note quality significantly.
Argumentative feedback yields the greatest improvements.
Engagement with diverse perspectives enhances collective intelligence.
Abstract
Social media platforms are now major sources of news and political communication, but their role in spreading misinformation has raised significant concerns. In response, these platforms have implemented various content moderation strategies. One such method, Community Notes (formerly Birdwatch) on X (formerly Twitter), relies on crowdsourced fact-checking and has gained traction. However, it faces challenges such as partisan bias and delays in verification. This study explores an AI-assisted hybrid moderation framework in which participants receive AI-generated feedback (supportive, neutral, or argumentative) on their notes and are asked to revise them accordingly. The results show that incorporating feedback improves note quality, with the most substantial gains coming from argumentative feedback. This underscores the value of diverse perspectives and direct engagement in…
