Look behind the Censorship: Reposting-User Characterization and Muted-Topic Restoration
Yichi Qian, Qiyi Shan, Hanjia Lyu, Jiebo Luo

TL;DR
This paper investigates censorship on Weibo by analyzing repost comments to characterize users and restore muted topics, revealing discussion patterns and user differences despite content removal.
Contribution
It introduces a web-scraping pipeline for repost data, characterizes censored users, and develops a thematic analysis method to infer original topics from repost comments.
Findings
Recovered discussions on muted social events.
Identified user characteristics related to censorship, in terms of gender, device, and account type.
Analyzed topic variations across user groups and time.
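The user characterization summarized above can be illustrated with a minimal sketch. It assumes each repost is a record carrying the three attributes named in the abstract (gender, device, account type) plus a flag marking whether the original post was censored. The field names, sample values, and the `censored_share` helper are hypothetical and are not taken from the paper's pipeline or dataset.

```python
from collections import defaultdict

# Hypothetical repost records; field names and values are illustrative only.
reposts = [
    {"gender": "f", "device": "iPhone", "account_type": "ordinary", "censored": True},
    {"gender": "f", "device": "Android", "account_type": "verified", "censored": False},
    {"gender": "m", "device": "Android", "account_type": "ordinary", "censored": True},
    {"gender": "m", "device": "iPhone", "account_type": "ordinary", "censored": True},
]


def censored_share(records, attribute):
    """Map each value of `attribute` to the fraction of its reposts flagged as censored."""
    totals = defaultdict(int)
    censored = defaultdict(int)
    for record in records:
        key = record[attribute]
        totals[key] += 1
        if record["censored"]:
            censored[key] += 1
    return {key: censored[key] / totals[key] for key in totals}


by_gender = censored_share(reposts, "gender")
by_device = censored_share(reposts, "device")
by_account = censored_share(reposts, "account_type")
```

Comparing these per-group shares is one simple way to see whether a user attribute is associated with censored reposts; the paper's own analysis may use different statistics.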
Abstract
The emergence of social media has largely eased the way people receive information and participate in public discussions. However, in countries with strict regulations on discussions in the public space, social media is no exception. To limit the degree of dissent or inhibit the spread of "harmful" information, a common approach is to impose information operations such as censorship/suspension on social media. In this paper, we focus on a study of censorship on Weibo, the counterpart of Twitter in China. Specifically, we 1) create a web-scraping pipeline and collect a large dataset focused solely on reposts from Weibo; 2) discover the characteristics of users whose reposts contain censored information, in terms of gender, device, and account type; and 3) conduct a thematic analysis by extracting and analyzing topic information. Note that although the original posts are no longer…
Taxonomy
Topics: Social Media and Politics · Misinformation and Its Impacts · Hate Speech and Cyberbullying Detection
