Prior-aware Dual Decomposition: Document-specific Topic Inference for Spectral Topic Models

Moontae Lee; David Bindel; David Mimno

arXiv:1711.07065·cs.CL·November 21, 2017

TL;DR

This paper introduces PADD, a novel method for document-specific topic inference in spectral topic models that leverages prior information to improve accuracy and efficiency over existing linear methods.

Contribution

The paper proposes PADD, a prior-aware dual decomposition approach that enhances spectral topic models by enabling parallel, document-specific topic inference with improved quality.

Findings

1. PADD outperforms TLI in topic composition accuracy.
2. PADD achieves comparable results to Gibbs sampling.
3. PADD effectively leverages topic correlations as priors.

Abstract

Spectral topic modeling algorithms operate on matrices/tensors of word co-occurrence statistics to learn topic-specific word distributions. This approach removes the dependence on the original documents and produces substantial gains in efficiency and provable topic inference, but at a cost: the model can no longer provide information about the topic composition of individual documents. Thresholded Linear Inverse (TLI) was recently proposed to map the observed words of each document back to its topic composition. However, its linear nature limits inference quality because it does not consider the important prior information over topics. In this paper, we evaluate the Simple Probabilistic Inverse (SPI) method and the novel Prior-aware Dual Decomposition (PADD), which is capable of learning document-specific topic compositions in parallel. Experiments show that PADD successfully leverages topic…
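To make concrete why working from co-occurrence statistics drops per-document information, here is a minimal sketch in Python. It is an illustration only, not the estimator used in the paper: the function name `cooccurrence_counts`, the toy corpus, and the choice to count each distinct word pair once per document are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count how many documents contain each unordered pair of distinct words.

    Hypothetical helper for illustration; real spectral topic models use
    more careful (normalized) co-occurrence estimators.
    """
    counts = Counter()
    for tokens in documents:
        # Use distinct word types, sorted so each pair has one canonical key.
        vocab = sorted(set(tokens))
        for w1, w2 in combinations(vocab, 2):
            counts[(w1, w2)] += 1
    return counts


# Toy corpus (assumed data, not from the paper).
docs = [
    ["topic", "model", "word"],
    ["topic", "word", "matrix"],
    ["matrix", "model"],
]

counts = cooccurrence_counts(docs)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

The resulting `counts` table is keyed only by word pairs and carries no document index, so the topic composition of any single document cannot be read off from it. That gap is what a separate per-document inference step such as TLI, SPI, or PADD is meant to fill.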

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis