Learning Topic Models by Neighborhood Aggregation

Ryohei Hisano

arXiv:1802.08012·stat.ML·September 17, 2019

TL;DR

This paper frames topic modeling as a neighborhood aggregation process over a network of words. This view makes it easier to add supervision, pre-trained word embeddings, and nonlinear output functions, and the resulting model is reported to outperform supervised LDA on document classification.

Contribution

It presents a network-based approach to topic modeling that simplifies incorporating supervision, pre-trained embeddings, and nonlinear output functions.

Findings

01 Outperforms state-of-the-art supervised LDA in document classification

02 Provides a unified framework for extending topic models with additional signals

03 Demonstrates the effectiveness of neighborhood aggregation in topic modeling

Abstract

Topic models are frequently used in machine learning owing to their high interpretability and modular structure. However, extending a topic model to include a supervisory signal, to incorporate pre-trained word embedding vectors and to include a nonlinear output function is not an easy task because one has to resort to a highly intricate approximate inference procedure. The present paper shows that topic modeling with pre-trained word embedding vectors can be viewed as implementing a neighborhood aggregation algorithm where messages are passed through a network defined over words. From the network view of topic models, nodes correspond to words in a document and edges correspond to either a relationship describing co-occurring words in a document or a relationship describing the same word in the corpus. The network view allows us to extend the model to include supervisory signals,…
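
The network view described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the function names `build_token_graph` and `aggregate`, the mean-of-neighbors update, and the random initial topic states are all assumptions made for illustration. The sketch only shows the graph structure the abstract names (one node per word occurrence, edges for co-occurrence in a document and for the same word elsewhere in the corpus) and a single generic round of message passing over it.

```python
import numpy as np


def build_token_graph(docs):
    """Build a graph with one node per word occurrence (hypothetical helper).

    Two nodes are linked if they occur in the same document or if they
    are occurrences of the same word anywhere in the corpus.
    """
    tokens = np.array([w for doc in docs for w in doc])
    doc_ids = np.array([d for d, doc in enumerate(docs) for _ in doc])
    same_doc = doc_ids[:, None] == doc_ids[None, :]
    same_word = tokens[:, None] == tokens[None, :]
    adj = (same_doc | same_word).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return tokens, doc_ids, adj


def aggregate(states, adj):
    """One round of neighborhood aggregation (assumed update rule).

    Each node replaces its topic vector with the mean of its neighbors'
    vectors, then renormalizes so the vector sums to one. Nodes without
    neighbors keep their current state.
    """
    deg = adj.sum(axis=1, keepdims=True)
    mean_msg = adj @ states / np.maximum(deg, 1.0)
    new_states = np.where(deg > 0, mean_msg, states)
    return new_states / new_states.sum(axis=1, keepdims=True)


# Toy corpus: two documents that share the word "apple".
docs = [["apple", "banana"], ["apple", "cherry"]]
tokens, doc_ids, adj = build_token_graph(docs)

# Random initial topic proportions over 3 topics for each word occurrence.
rng = np.random.default_rng(0)
states = rng.dirichlet(np.ones(3), size=len(tokens))
updated = aggregate(states, adj)
```

In this toy corpus the two "apple" nodes are linked across documents, so information about one document's topics can reach the other through the shared word. The paper's actual update, including how embeddings and supervision enter, should be taken from the paper itself.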

Taxonomy

Methods, Interpretability