Spectral Learning for Supervised Topic Models

Yong Ren; Yining Wang; Jun Zhu

arXiv:1602.06025·cs.LG·February 22, 2016

TL;DR

This paper introduces spectral algorithms for supervised LDA that are provably correct and computationally efficient, and that match or outperform existing methods on both synthetic and real-world datasets, including large-scale review data.

Contribution

The paper develops spectral algorithms for supervised LDA, establishes theoretical guarantees and sample complexity bounds for them, and demonstrates empirical performance comparable to or better than existing methods.

Findings

01

Spectral algorithms are provably correct and computationally efficient.

02

The single-phase spectral method achieves results comparable to or better than state-of-the-art methods.

03

Experiments on large-scale datasets validate the practical effectiveness of the proposed methods.

Abstract

Supervised topic models simultaneously model the latent topic structure of large collections of documents and a response variable associated with each document. Existing inference methods are based on variational approximation or Monte Carlo sampling, both of which often suffer from local minima. Spectral methods have been applied to learn unsupervised topic models, such as latent Dirichlet allocation (LDA), with provable guarantees. This paper investigates the possibility of applying spectral methods to recover the parameters of supervised LDA (sLDA). We first present a two-stage spectral method, which recovers the parameters of LDA and then applies a power update method to recover the regression model parameters. We then present a single-phase spectral algorithm that jointly recovers the topic distribution matrix and the regression weights. Our spectral algorithms are…
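To give a feel for the kind of computation that spectral methods for LDA rely on, the sketch below runs a generic tensor power iteration with deflation on a toy symmetric third-order tensor. It is an illustration under stated assumptions, not the paper's algorithm: it assumes the tensor is already orthogonally decomposable (in practice this follows a whitening step on estimated moments, omitted here), and it does not include the power update for regression parameters that the abstract mentions. The names `power_iteration`, `decompose`, `V_true`, and `lam_true` are hypothetical.

```python
import numpy as np


def power_iteration(T, n_iters=100, n_restarts=10, rng=None):
    """Find one (eigenvalue, eigenvector) pair of a symmetric 3rd-order tensor."""
    rng = np.random.default_rng() if rng is None else rng
    best_lam, best_v = -np.inf, None
    for _ in range(n_restarts):
        v = rng.normal(size=T.shape[0])
        v /= np.linalg.norm(v)
        for _ in range(n_iters):
            # Tensor analogue of a matrix-vector product: v <- T(I, v, v).
            v = np.einsum("ijk,j,k->i", T, v, v)
            v /= np.linalg.norm(v)
        # Eigenvalue estimate: T(v, v, v).
        lam = np.einsum("ijk,i,j,k->", T, v, v, v)
        if lam > best_lam:
            best_lam, best_v = lam, v
    return best_lam, best_v


def decompose(T, n_components, seed=0):
    """Extract components one at a time, subtracting each from the tensor."""
    rng = np.random.default_rng(seed)
    T = T.copy()
    lams, vecs = [], []
    for _ in range(n_components):
        lam, v = power_iteration(T, rng=rng)
        lams.append(lam)
        vecs.append(v)
        # Deflation: remove the recovered rank-one term lam * v (x) v (x) v.
        T -= lam * np.einsum("i,j,k->ijk", v, v, v)
    return np.array(lams), np.array(vecs)


# Toy input (assumed, not from the paper): three orthonormal components
# with positive weights, combined into a 3 x 3 x 3 symmetric tensor.
rng = np.random.default_rng(1)
V_true, _ = np.linalg.qr(rng.normal(size=(3, 3)))
lam_true = np.array([3.0, 2.0, 1.0])
T = np.einsum("r,ir,jr,kr->ijk", lam_true, V_true, V_true, V_true)

lams, vecs = decompose(T, n_components=3)
print(np.sort(lams))
```

On this noiseless toy input the recovered weights and vectors should agree with `lam_true` and the columns of `V_true` up to ordering. With moments estimated from finite data, the recovery is only approximate, which is where sample complexity bounds of the kind the abstract refers to become relevant.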

Taxonomy

Topics: Text and Document Classification Technologies · Topic Modeling · Domain Adaptation and Few-Shot Learning

Methods: Latent Dirichlet Allocation