Hierarchical relational models for document networks

Jonathan Chang; David M. Blei

arXiv:0909.4331·stat.AP·October 7, 2010


TL;DR

The paper introduces the relational topic model (RTM), a hierarchical model of both network structure and document content. It can be used to summarize a network of documents, predict links between them, and predict words within them.

Contribution

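It presents the RTM, a hierarchical model that treats each link between a pair of documents as a binary random variable conditioned on their contents, together with variational inference and estimation algorithms that take advantage of sparsity and scale with the number of links.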

Findings

01

The RTM predicts links in large document networks; it is evaluated on scientific abstracts, web documents, and geographically tagged news.

02

The model can summarize a network of documents and predict the words within them.

03

RTM outperforms existing methods in link prediction tasks.

Abstract

We develop the relational topic model (RTM), a hierarchical model of both network structure and node attributes. We focus on document networks, where the attributes of each document are its words, that is, discrete observations taken from a fixed vocabulary. For each pair of documents, the RTM models their link as a binary random variable that is conditioned on their contents. The model can be used to summarize a network of documents, predict links between them, and predict words within them. We derive efficient inference and estimation algorithms based on variational methods that take advantage of sparsity and scale with the number of links. We evaluate the predictive performance of the RTM for large networks of scientific abstracts, web documents, and geographically tagged news.
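
The abstract describes each link as a binary random variable conditioned on the contents of the two documents, but it does not state the functional form. The following is a minimal sketch of one way such a link function could look, assuming a logistic function applied to a weighted element-wise product of the two documents' empirical topic proportions. The function names, the weight vector `eta`, the intercept `nu`, and all numeric values are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence


def mean_topic_proportions(assignments: Sequence[int], num_topics: int) -> List[float]:
    """Empirical topic proportions of one document from its per-word topic assignments."""
    if not assignments:
        raise ValueError("a document needs at least one word")
    counts = [0.0] * num_topics
    for k in assignments:
        counts[k] += 1.0
    return [c / len(assignments) for c in counts]


def link_probability(
    zbar_a: Sequence[float],
    zbar_b: Sequence[float],
    eta: Sequence[float],
    nu: float,
) -> float:
    """Assumed link function: sigmoid(eta . (zbar_a * zbar_b) + nu)."""
    score = sum(e * a * b for e, a, b in zip(eta, zbar_a, zbar_b)) + nu
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical toy data: each list holds the topic index assigned to each word.
NUM_TOPICS = 3
doc_a = mean_topic_proportions([0, 0, 1, 0], NUM_TOPICS)  # mostly topic 0
doc_b = mean_topic_proportions([0, 0, 0, 1], NUM_TOPICS)  # mostly topic 0
doc_c = mean_topic_proportions([2, 2, 2, 1], NUM_TOPICS)  # mostly topic 2

eta = [4.0, 4.0, 4.0]  # hypothetical per-topic weights
nu = -2.0              # hypothetical intercept

p_ab = link_probability(doc_a, doc_b, eta, nu)
p_ac = link_probability(doc_a, doc_c, eta, nu)
print(f"P(link a-b) = {p_ab:.3f}, P(link a-c) = {p_ac:.3f}")
```

In this sketch, documents that share topics (a and b) receive a higher link probability than documents with little topical overlap (a and c), which is the behavior a content-conditioned link variable is meant to capture.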
