Modeling Online Discourse with Coupled Distributed Topics
Nikita Srivatsan, Zachary Wojtowicz, Taylor Berg-Kirkpatrick

TL;DR
This paper introduces a deep, structured topic model that captures social interactions in online forums, enabling scalable analysis of large datasets like Reddit comments and improving understanding of online discourse.
Contribution
The paper presents a novel deep, globally normalized topic model that incorporates reply-link structure and latent distributed representations, and that scales to large corpora through GPU-based mean-field inference.
Findings
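The model captures reply-based interactions between comments in online forums, in addition to traditional topic information.
It achieves lower perplexity than existing models.
Qualitative analysis of the learned interaction patterns offers insight into how users interact along reply links.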
Abstract
In this paper, we propose a deep, globally normalized topic model that incorporates structural relationships connecting documents in socially generated corpora, such as online forums. Our model (1) captures discursive interactions along observed reply links in addition to traditional topic information, and (2) incorporates latent distributed representations arranged in a deep architecture, which enables a GPU-based mean-field inference procedure that scales efficiently to large data. We apply our model to a new social media dataset consisting of 13M comments mined from the popular internet forum Reddit, a domain that poses significant challenges to models that do not account for relationships connecting user comments. We evaluate against existing methods across multiple metrics including perplexity and metadata prediction, and qualitatively analyze the learned interaction patterns.
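Perplexity, one of the evaluation metrics named in the abstract, can be illustrated with a minimal sketch. The function below assumes that a model has already assigned a natural-log probability to every held-out token; the name `perplexity` and the input layout (one list of log-probabilities per comment) are hypothetical choices for illustration, and the paper's exact evaluation protocol is not described in this summary.

```python
import math


def perplexity(token_log_probs):
    """Corpus-level perplexity from per-token natural-log probabilities.

    token_log_probs: list of documents, each a list of log p(token) values.
    This input layout is an assumption made for illustration only.
    """
    total_log_prob = 0.0
    total_tokens = 0
    for doc in token_log_probs:
        total_log_prob += sum(doc)
        total_tokens += len(doc)
    if total_tokens == 0:
        raise ValueError("need at least one token to compute perplexity")
    # Perplexity is the exponentiated average negative log-likelihood per token.
    return math.exp(-total_log_prob / total_tokens)


# Toy check: a model that spreads probability uniformly over 4 words
# should have a perplexity of about 4, regardless of document lengths.
uniform = [[math.log(0.25)] * 3, [math.log(0.25)] * 5]
print(perplexity(uniform))
```

Lower values mean the model assigns higher probability to the held-out text, which is why a lower perplexity is read as a better fit.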
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Natural Language Processing Techniques
