Can Self-Censorship in News Media be Detected Algorithmically? A Case Study in Latin America
Rongrong Tao, Baojian Zhou, Feng Chen, Naifeng Liu, David Mares, Patrick Butler, Naren Ramakrishnan

TL;DR
This paper introduces an unsupervised algorithmic framework that detects self-censorship in traditional news media by comparing it with social media data, successfully identifying censorship events in Latin American countries.
Contribution
It presents a novel hypothesis testing approach and a new near-linear-time algorithm, GraphDPD, for detecting censorship clusters in news media using social media as a sensor.
Findings
Accurately detects censorship events in real-world Latin American data.
Effective in identifying censored clusters with semi-synthetic and real datasets.
Demonstrates the potential of social media as a sensor for traditional media censorship.
Abstract
Censorship in social media has been well studied and provides insight into how governments stifle freedom of expression online. Comparatively less (or no) attention has been paid to detecting (self) censorship in traditional media (e.g., news) using social media as a bellwether. We present a novel unsupervised approach that views social media as a sensor to detect censorship in news media wherein statistically significant differences between information published in the news media and the correlated information published in social media are automatically identified as candidate censored events. We develop a hypothesis testing framework to identify and evaluate censored clusters of keywords, and a new near-linear-time algorithm (called GraphDPD) to identify the highest scoring clusters as indicators of censorship. We outline extensive experiments on semi-synthetic data as well as real…
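The core idea in the abstract, flagging keywords whose news coverage falls significantly below what social media activity would predict, can be illustrated with a toy per-keyword test. This is only a minimal sketch under stated assumptions: it is not GraphDPD and not the paper's scoring function. The function names, the Poisson-style z-score, the threshold of 3.0, and the example counts are all hypothetical, and the sketch scores keywords independently rather than searching for connected clusters.

```python
import math


def censorship_z_score(news_base, social_base, news_win, social_win):
    """Toy z-score (an assumption, not the paper's statistic).

    Estimates the usual news-to-social ratio from a baseline period, then
    measures how far the news count in the test window falls from the count
    that ratio would predict, treating the news count as roughly Poisson.
    """
    if news_base <= 0 or social_base <= 0:
        return 0.0  # no usable baseline for this keyword
    ratio = news_base / social_base
    expected = ratio * social_win
    if expected <= 0:
        return 0.0  # nothing on social media in the window, nothing to compare
    return (news_win - expected) / math.sqrt(expected)


def flag_keywords(counts, threshold=3.0):
    """Return (keyword, z) pairs whose news coverage is unusually low.

    counts maps keyword -> (news_base, social_base, news_win, social_win).
    Results are sorted with the most strongly under-covered keyword first.
    """
    flagged = []
    for keyword, (nb, sb, nw, sw) in counts.items():
        z = censorship_z_score(nb, sb, nw, sw)
        if z <= -threshold:
            flagged.append((keyword, z))
    return sorted(flagged, key=lambda pair: pair[1])


# Hypothetical counts, made up purely for illustration.
counts = {
    "protest": (200, 1000, 2, 500),    # news nearly silent while social stays active
    "football": (300, 1500, 95, 500),  # news roughly tracks social media
}

for keyword, z in flag_keywords(counts):
    print(f"{keyword}: z = {z:.1f}")
```

In this toy data only "protest" is flagged, because its news count in the window is far below the level implied by its baseline ratio. The approach described in the abstract goes further by evaluating clusters of keywords together, which a per-keyword test like this one does not attempt.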
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Social Media and Politics · Internet Traffic Analysis and Secure E-voting
