Learning Topic Models by Neighborhood Aggregation
Ryohei Hisano

TL;DR
This paper reframes topic modeling as a neighborhood aggregation process over word networks, which makes it easier to integrate supervision, pre-trained embeddings, and nonlinear output functions, and reports better document classification performance than state-of-the-art supervised LDA.
Contribution
It presents a network-based approach to topic modeling that simplifies incorporating supervision, pre-trained embeddings, and nonlinear output functions.
Findings
Outperforms state-of-the-art supervised LDA in document classification
Provides a unified framework for extending topic models with additional signals
Demonstrates the effectiveness of neighborhood aggregation in topic modeling
Abstract
Topic models are frequently used in machine learning owing to their high interpretability and modular structure. However, extending a topic model to include a supervisory signal, to incorporate pre-trained word embedding vectors and to include a nonlinear output function is not an easy task because one has to resort to a highly intricate approximate inference procedure. The present paper shows that topic modeling with pre-trained word embedding vectors can be viewed as implementing a neighborhood aggregation algorithm where messages are passed through a network defined over words. From the network view of topic models, nodes correspond to words in a document and edges correspond to either a relationship describing co-occurring words in a document or a relationship describing the same word in the corpus. The network view allows us to extend the model to include supervisory signals,…
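The network view described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract is truncated here and does not give the update rule, so the mean aggregator, the function names `build_word_network` and `aggregate`, and the toy two-dimensional embeddings are all assumptions chosen for illustration. The sketch only shows the graph structure (one node per word occurrence, edges for co-occurrence in a document and for the same word elsewhere in the corpus) and a single round of message passing over it.

```python
from collections import defaultdict


def build_word_network(docs):
    """Build neighbor lists for a word network (hypothetical helper).

    Each node is a word occurrence (doc_index, position). Edges connect
    occurrences in the same document, and occurrences of the same word
    in other documents of the corpus.
    """
    by_doc = defaultdict(list)
    by_word = defaultdict(list)
    for d, doc in enumerate(docs):
        for i, word in enumerate(doc):
            by_doc[d].append((d, i))
            by_word[word].append((d, i))

    neighbors = {}
    for d, doc in enumerate(docs):
        for i, word in enumerate(doc):
            same_doc = [n for n in by_doc[d] if n != (d, i)]
            same_word = [n for n in by_word[word] if n[0] != d]
            neighbors[(d, i)] = same_doc + same_word
    return neighbors


def aggregate(neighbors, state):
    """One round of mean aggregation (an assumed aggregator, not the paper's).

    Each node's new state is the average of its neighbors' current states;
    a node without neighbors keeps its own state.
    """
    new_state = {}
    for node, nbrs in neighbors.items():
        if not nbrs:
            new_state[node] = list(state[node])
            continue
        dim = len(state[node])
        new_state[node] = [
            sum(state[n][k] for n in nbrs) / len(nbrs) for k in range(dim)
        ]
    return new_state


# Toy corpus and made-up embeddings standing in for pre-trained vectors.
docs = [["topic", "model"], ["topic", "network"]]
embeddings = {"topic": [1.0, 0.0], "model": [0.0, 1.0], "network": [0.0, 2.0]}

neighbors = build_word_network(docs)
state = {(d, i): embeddings[w] for d, doc in enumerate(docs) for i, w in enumerate(doc)}
new_state = aggregate(neighbors, state)
```

In this toy corpus, the first occurrence of "topic" receives messages from "model" (same document) and from the second occurrence of "topic" (same word in the corpus), so its state after one round mixes document-level and corpus-level information.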
Taxonomy
Methods
Interpretability
