Something's Brewing! Early Prediction of Controversy-causing Posts from Discussion Features
Jack Hessel, Lillian Lee

TL;DR
This paper predicts whether online posts will become controversial using features of the early discussion. It shows that the text and tree structure of the first few comments often improve on content-and-rate-only baselines, and that structure features often transfer better across communities.
Contribution
It introduces a novel approach combining textual and structural discussion features to predict controversy early, with evidence of cross-community generalization.
Findings
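Discussion features often add predictive capacity over strong content-and-rate-only baselines.
Conversation-structure features often generalize to other communities better than conversation-content features.
Even a handful of early comments, e.g., the first 5 made within 15 minutes of the original post, carry useful predictive signal.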
Abstract
Controversial posts are those that split the preferences of a community, receiving both significant positive and significant negative feedback. Our inclusion of the word "community" here is deliberate: what is controversial to some audiences may not be so to others. Using data from several different communities on reddit.com, we predict the ultimate controversiality of posts, leveraging features drawn from both the textual content and the tree structure of the early comments that initiate the discussion. We find that even when only a handful of comments are available, e.g., the first 5 comments made within 15 minutes of the original post, discussion features often add predictive capacity to strong content-and-rate only baselines. Additional experiments on domain transfer suggest that conversation-structure features often generalize to other communities better than conversation-content…
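To make "features drawn from the tree structure of the early comments" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' feature set: the comment fields (`id`, `parent`, `created`), the helper names, and the four features computed are all hypothetical. Only the "first 5 comments within 15 minutes" window comes from the abstract.

```python
from collections import defaultdict

def early_comments(comments, post_time, max_count=5, max_minutes=15):
    """Keep the first `max_count` comments made within `max_minutes` of the post.

    Each comment is a dict with hypothetical keys: 'id', 'parent', 'created'
    (seconds since epoch). These names are assumptions for illustration.
    """
    cutoff = post_time + max_minutes * 60
    in_window = sorted(
        (c for c in comments if c["created"] <= cutoff),
        key=lambda c: c["created"],
    )
    return in_window[:max_count]

def structure_features(comments, post_id):
    """Compute a few simple tree-shape features (illustrative choices only)."""
    kept = {c["id"] for c in comments}
    children = defaultdict(list)
    for c in comments:
        # Attach a comment only if its parent is the post or a kept comment.
        if c["parent"] == post_id or c["parent"] in kept:
            children[c["parent"]].append(c["id"])

    def depth(node):
        # Depth of the subtree below `node`; a leaf has depth 0.
        kids = children.get(node, [])
        return 0 if not kids else 1 + max(depth(k) for k in kids)

    return {
        "n_comments": len(comments),
        "n_top_level": len(children.get(post_id, [])),
        "max_depth": depth(post_id),
        "max_branching": max((len(v) for v in children.values()), default=0),
    }

# Toy thread: post "p" created at t=0, with seven comments.
thread = [
    {"id": "a", "parent": "p", "created": 60},
    {"id": "b", "parent": "a", "created": 120},
    {"id": "c", "parent": "p", "created": 200},
    {"id": "d", "parent": "b", "created": 300},
    {"id": "e", "parent": "a", "created": 400},
    {"id": "f", "parent": "p", "created": 500},   # sixth comment: dropped by count
    {"id": "g", "parent": "p", "created": 2000},  # after 15 minutes: dropped by time
]
early = early_comments(thread, post_time=0)
feats = structure_features(early, post_id="p")
print(feats)
```

Features like these could be concatenated with text and comment-rate features before training a classifier. Which structural statistics are actually informative is an empirical question the paper itself addresses.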
Taxonomy
Topics: Topic Modeling · Misinformation and Its Impacts · Advanced Text Analysis Techniques
