TL;DR
This paper models the formation and evolution of online discussion threads using a self-exciting Hawkes process, enabling better prediction of future activity and discussion size on platforms like Reddit.
Contribution
It introduces a dynamic model for discussion trees that captures their formation and predicts future discussion growth, outperforming previous static approaches.
Findings
Discussion trees resemble Galton-Watson processes.
The Hawkes process model accurately predicts discussion size.
The approach outperforms previous prediction methods.
Abstract
Internet boards are platforms for online discussions about a variety of topics. On these boards, individuals may start a new thread on a specific matter, or leave comments in an existing discussion. The resulting collective process leads to the formation of "discussion trees", where nodes represent a post and comments, and an edge represents a "reply-to" relation. The structure of discussion trees has been analysed in previous works, but only from a static perspective. In this paper, we focus on their structural and dynamical properties by modelling their formation as a self-exciting Hawkes process. We first study a Reddit dataset to show that the structure of the trees resembles that produced by a Galton-Watson process with a special root offspring distribution. The dynamical aspect of the model is then used to predict future commenting activity and the final size of a discussion tree.…
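To make the structural claim concrete, here is a minimal sketch of a Galton-Watson tree in which the root (the original post) has its own offspring distribution and every comment shares a second one. The abstract does not say which distributions the paper uses, so the Poisson choice, the function name `simulate_gw_tree_size`, and the parameter values below are all assumptions for illustration only, not the authors' fitted model.

```python
import numpy as np


def simulate_gw_tree_size(root_mean, child_mean, rng, max_nodes=100_000):
    """Return the node count (post plus comments) of one simulated discussion tree.

    Assumption: offspring counts are Poisson; the source abstract does not
    specify the distributions used in the paper.
    """
    # The root (original post) draws its direct replies from its own distribution.
    current_generation = rng.poisson(root_mean)
    total = 1 + current_generation
    # Every comment draws replies from one shared offspring distribution.
    while current_generation > 0 and total < max_nodes:
        # The sum of n independent Poisson(m) draws is Poisson(n * m),
        # so a whole generation can be sampled in a single call.
        current_generation = rng.poisson(child_mean * current_generation)
        total += current_generation
    return int(total)


# Hypothetical parameters: 5 direct replies to the post on average,
# 0.5 replies per comment on average (subcritical, so trees stay finite).
rng = np.random.default_rng(0)
sizes = [simulate_gw_tree_size(5.0, 0.5, rng) for _ in range(10_000)]
mean_size = float(np.mean(sizes))
print(f"mean simulated tree size: {mean_size:.2f}")
```

Under these assumed parameters, the expected tree size is 1 + root_mean / (1 - child_mean) = 11, so the simulated mean should land close to that value. This sketch covers only the static tree structure; the timing of comments, which the paper models with a Hawkes process, is not simulated here.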
