Spiral of Silence: How Neutral Moderation Polarizes Content Creation
Ying Bao, Jessie Liu

TL;DR
This paper models how neutral content moderation can unintentionally suppress minority content and increase polarization in online environments due to affective polarization and externalities.
Contribution
It introduces a model showing that neutral moderation can lead to self-censorship and polarization, highlighting endogenous bias and the interaction with content personalization.
Findings
Neutral moderation can suppress non-toxic content creation, particularly by ideological minorities.
Removing toxic content sometimes benefits majority creators but increases out-group animosity toward minority creators.
Content personalization interacts with moderation, affecting content diversity.
Abstract
This paper investigates how content moderation affects content creation in ideologically diverse online environments. We develop a model in which users act as both creators and consumers, differing in their ideological affiliation and propensity to produce toxic content. Affective polarization, i.e., users' aversion to ideologically opposed content, interacts with moderation in unintended ways. We show that even ideologically neutral moderation that targets only toxicity can suppress non-toxic content creation, particularly from ideological minorities. Our analysis reveals a content-level externality: when toxic content is removed, non-toxic posts gain exposure. While creators from the ideological majority group sometimes benefit from this exposure, they do not internalize the negative spillovers, i.e., increased out-group animosity toward minority creators. This can discourage…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Misinformation and Its Impacts · Social Media and Politics
